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Abstract 


Linear  prediction  filtering  techniques  have  been  used  in  studying  the  coupling 
processes  between  the  solar  wind  and  magnetosphere.  The  magnetosphere  is  a  complex, 
dynamic  system  with  at  least  two  independent  coupling  methods  for  energy  input,  driven 
and  unloading.  Linear  models  were  built  and  tested  on  the  Bargatze  data  set,  consisting  of 
over  70  days  of  geomagnetic  indices  and  solar  wind  data  ordered  in  34  intervals  of 
increasing  geomagnetic  activity.  Linear  filtering  techniques  employing  single-  and 
multiple-input,  autoregressive  models  predicted  values  of  the  magnetic  index  AL  from 
solar  wind  data.  The  impulse  response  curves  of  the  AL-coupling  fianction  groups  showed 
amplitude  peaks  at  25  and  70  minutes,  confirming  results  in  previous  studies.  The 
separate  peaks  indicate  responses  corresponding  to  the  driven  and  unloading  time  scales. 
The  average  correlation  coefficients  generated  between  predicted  AL  values  and  the 
measured  values  of  AL  were  0.665,  0.738,  and  0.793  for  single,  dual,  and  triple  input 


models,  respectively. 


APPLICATION  OF  AUTOREGRESSIVE  MOVING  AVERAGE  LINEAR 
PREDICTION  FILTERS  TO  THE  CHARACTERIZATION  OF 
SOLAR  WIND-MAGNETOSPHERE  COUPLING 


I.  Introduction 


Background 

In  the  last  three  decades  of  solar  wind-magnetosphere  coupling  research, 
geomagnetic  activity  has  been  indirectly  linked  to  variations  in  the  solar  wind.  In  general, 
changes  in  components  of  the  solar  wind  (bulk  solar  wind  speed.  Interplanetary  Magnetic 
Field  (IMF)  strength  and  orientation,  etc.)  cause  changes  in  the  geomagnetic  activity 
levels.  The  exact  relation  defining  the  interaction  between  the  solar  wind  and  the 
magnetosphere  is  currently  unknown.  Several  theories  have  been  proposed,  but  none  are 
able  to  account  for  all  of  the  observed  phenomena. 

Due  to  the  complexity  of  the  detailed  physical  relationship  between  solar  wind 
components  and  geomagnetic  activity  levels,  empirical  techniques  have  been  used  to  relate 
the  two.  Correlative  studies  [Gonzalez,  1990]  have  established  “important”  solar  wind 
parameters.  When  adopted  as  “inputs”  in  a  linear  predictive  filter  analysis  [Bargatze  et  al, 
1985;  Clauer,  1986]  or  “states”  in  a  state-input  space  approach  [Vassiliadis  et  al.,  1995], 
geomagnetic  activity,  in  terms  of  the  magnetic  indices,  may  be  forecast  with  a  high  degree 
of  success.  These  system  approaches  additionally  provide  insight  on  energy 


transfer/relaxation  time  scales  and  permit  assessment  of  simple  models  of  solar  wind- 
magnetosphere  coupling. 

The  solar  wind  is  an  extension  of  the  Sun’s  corona.  The  Sun’s  activity,  and  hence 
the  solar  wind’s  characteristics,  are  primarily  chaotic  in  nature,  with  only  a  minimum  of  its 
features  being  predictable.  If  a  forecast  of  the  geomagnetic  indices  is  desired,  solar  wind 
data  retrieved  from  a  point  upstream  of  the  Earth  (nearer  to  the  Sun)  is  required.  The 
WIND/Solar  Wind  Interplanetary  Measurements  (WIND/SWIM)  satellite,  launched  by 
National  Aeronautics  and  Space  Administration  (NASA)  in  November  of  1994,  provides 
data  on  the  components  of  the  solar  wind  (Figure  1).  In  May  1997,  the  satellite  will  enter 
a  permanent  orbit  at  the  Lagrange  point,  named  “LI”,  located  at  approximately  1%  of  the 
Sun-Earth  distance  upstream.  From  this  distance,  data  recorded  by  the  satellite  provides 
researchers  approximately  one  hour  lead  time  before  coupling  actions  are  initiated 
between  the  solar  wind  and  magnetosphere. 

Problem  Statement 

The  50th  Weather  Squadron  (50  WS),  located  at  Falcon  AFB  in  Colorado  Springs, 
prepares  and  disseminates  forecasts  and  information  on  the  solar  and  geomagnetic  activity 
levels.  The  standard  forecast  is  persistence,  i.e.,  forecasting  current  conditions  to  continue 
into  the  future.  The  50  WS  has  currently  begun  implementing  and  testing  the 
Magnetospheric  Specification  Model  (MSM),  a  model  that  describes  the  current  state  of 
the  magnetosphere.  The  model  uses  various  inputs  but  can  be  run  with  a  single  input,  the 
magnetic  index  Kp. 
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In  the  near  future,  the  Magnetospheric  Specification  Forecast  Model  (MSFM)  will 
begin  testing.  Until  the  MSFM  (a  true  forecast  model)  is  fully  implemented,  the  MSM  (a 
“nowcasf  ’  model)  using  a  predicted  Ap  will  provide  the  only  predictive  capability  to  the 
forecasters.  Thus,  the  development  of  a  model  which  can  predict  Ap  would  greatly 
enhance  the  capability  of  the  MSM  and  provide  a  better  prediction  of  geomagnetic  activity 
than  persistence. 


WIND  spacecraft  and  instrOTeh^ 


TGRS 


WAVES 


KONUS 
EPACT 


SMS,  not  shovyn,  is  on  the  far  side 
of  the  spacecraft.  Second  sensor 
for  KONUS  is  obscured  on  the  bottom ; 
of  the  spacecraft. 


Figure  1.  WBVD/SWIM  Satellite.  (Courtesy  of  the  Space  Environment  Lab,  Boulder.) 


Research  Objectives 


The  initial  research  objective  of  this  thesis  will  be  to  reproduce  the  results  of 
Clauer  [1986],  applying  linear  prediction  filter  techniques  to  solar  wind-magnetosphere 
coupling.  Specifically,  using  the  Bargatze  data  set  [Bargatze,  1985],  a  5-interval, 
moving-average  filter  will  be  constructed  to  predict  AL  from  VBs,  the  product  of  the  bulk 
solar  wind  speed  and  the  rectified  vertical  component  of  the  solar  wind’s  magnetic  field. 

The  results  will  be  compared  with  previous  works  and  actual  measurements  of  the 
AL  index.  Multiple  input  models  will  then  be  developed  to  determine  possible 
improvement  in  the  predictive  capability  of  linear  filter  models.  The  results  of  the  multiple 
input  models  will  then  be  contrasted  with  a  state-input  space  model  [Vassiliadis,  1995]. 

Data  Source 

The  data  base  for  this  study  will  be  the  Bargatze  data  set.  This  data  set  is 
comprised  of  data  recorded  from  November  1973  to  December  1974  and  is  the  same  data 
set  used  by  Clauer  [1986]  and  Vassiliadis,  et  al.  [1995].  The  data  set  consists  of  the  AL 
and  AE  indices  as  well  as  solar  wind  plasma  and  IMF  measurements  at  2.5  minute  time 
resolution.  The  plasma  and  IMF  observations  were  averaged  over  2.5  minute  periods  for 
comparison  with  the  AL  index  values,  which  were  obtained  from  the  World  Data  Center, 
Boulder,  Colorado  [Clauer,  1986]. 

There  are  34  time  intervals  included  in  the  Bargatze  data  set,  ordered  from  low- 
to  high-level  activity,  and  encompassing  73  days  of  data.  The  intervals  are  at  least  one  day 
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long,  but  shorter  than  four  days.  Each  interval  is  temporally  bounded  at  both  ends  by  a  2- 
hour  segment  of  small,  nearly  zero  solar  wind  input  and  AL  index  values  [Bargatze,  1985], 


II.  Literature  Review 


In  this  chapter,  the  relevant  background  information  will  be  given  in  five  sections. 
The  first  section  will  discuss  the  physics  of  the  solar  wind,  magnetosphere,  and  solar  wind- 
magnetosphere  coupling.  The  second  portion  will  introduce  the  relevant  geomagnetic 
indices,  discuss  how  they  are  formed,  and  identify  the  geographic  latitudes  where  their 
areas  of  greatest  influence  lie.  The  third  section  will  give  a  more  detailed  description  of 
the  WIND/SWIM  satellite.  The  fourth  section  will  review  the  approaches  and  findings  of 
Bargatze  et  al.  [1985],  Clauer  [1986],  and  Vassiliadis  et  al.  [1995].  The  last  section  will 
introduce  and  explain  the  criteria  for  assessing  the  quality  and  accuracy  of  the  overall 
output. 

Solar  Wind  and  Magnetosphere 

The  sun’s  atmosphere  does  not  have  a  well-defined  exterior  boundary.  The  solar 
corona  expands  into  space,  and  is  referred  to  as  the  solar  wind.  The  solar  wind  is  a  fully- 
ionized,  quasi-neutral  plasma,  consisting  primarily  of  electrons  and  protons,  but  including 
some  traces  of  heavier  ionized  particles  [Parks,  1991].  The  solar  wind  plasma,  by 
definition  an  excellent  conductor,  carries  the  corona’s  magnetic  field  with  it  as  it  leaves  the 
sun’s  atmosphere.  As  the  solar  wind  convects  outward  to  space,  it  stretches  the  sun’s 
open  magnetic  field  lines  outward,  creating  the  IMF  [Tascione,  1994]. 

The  solar  wind’s  parameters,  i.e.,  the  bulk  velocity,  IMF  strength,  and  IMF 
orientation,  are  the  primary  determinants  of  geomagnetic  activity.  Characteristic  values 
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of  the  bulk  velocity  and  IMF  strength  range  between  300  and  1000  km-s’^  and  ones  to  tens 
of  nanotesla  for  the  vertical  component,  respectively.  The  values  of  the  horizontal  IMF 
components  are  much  stronger,  ranging  from  tens  to  a  few  hundred  nanotesla.  These 
variations  occur  primarily  because  of  Coronal  Mass  Ejections  (CME),  Coronal  Holes 
(CH),  and  solar  flares.  Solar  flares  and  CMEs  are  sporadic  in  nature  and  provide  short¬ 
lived  bursts  of  energy  to  the  solar  wind.  Coronal  holes  are  relatively  long-lived  features, 
and  may  influence  geomagnetic  activity  for  several  solar  rotations  [Tascione,  1994].  High 
solar  wind  speed,  high  IMF  strength,  and  southward-oriented  IMF  field  lines  are,  in 
combination  or  individually,  the  major  causes  for  the  largest  variations  in  geomagnetic 
activity,  geomagnetic  storms. 

Geomagnetic  storms  are  divided  into  2  primary  classes:  substorms  and  major 
storms.  Substorms  occur  on  a  time  scale  of  1  to  3  hours  and  are  generally  less  than  half  as 
intense  as  the  major  storms.  The  major  storms  are  typically  longer  in  duration,  lasting 
from  3  hours  to  7  days,  with  intensities,  expressed  in  terms  of  magnetic  deviations  from 
the  norm,  averaging  larger  than  -50  nT  (where  the  more  negative  the  measurement  of  the 
storm  is,  the  greater  its  strength).  The  true  source  of  a  substorm  is  still  unresolved; 
however,  some  causes  ascribed  in  current  theories  are  a  sudden  change  of  orientation  in 
IMF  from  northward  to  southward,  a  sharp  change  in  solar  wind  speed  or  density,  or  even 
a  spontaneous  event  with  no  apparent  cause  [Hargreaves,  1992:200].  Major  storms  are 
almost  always  related  to  significant  activity  on  the  surface  of  the  Sun.  Sunspot  groups, 
large  coronal  holes,  and  high  solar  flare  activity  over  an  extended  period  of  time  are  some 
causes  of  major  storms  and  high  geomagnetic  activity. 


7 


The  solar  wind  experiences  fluid-like  shocks,  due  to  acceleration  of  the  plasma  as 
it  leaves  the  solar  corona.  This  was  first  proposed  in  E.N.  Parker’s  dynamic 
model  of  the  sun  and  solar  wind  [Parker,  1958],  The  speed  of  the  solar  wind  averages 
approximately  400  km  s'\  which  is  5  to  10  times  greater  than  supersonics  speeds  in  space. 
As  the  solar  wind  expands  radially,  a  slow-moving  region  may  be  compressed  by  a  fast- 
moving  region  ejected  after  it.  The  diffusion  of  magnetic  fields  between  the  regions  does 
not  occur  over  the  time  they  take  to  reach  the  Earth,  so  a  discontinuity  develops  between 
the  fast-  and  slow-moving  regions.  As  there  are  no  actual  collisions  between  particles  or 
the  regions  containing  them,  the  shocks  are  termed  collisionless  [Parks,  1991:5].  It  is 
through  collisionless  interactions  that  the  energy  stored  in  magnetic  fields  is  transferred  to 
the  plasma. 

The  magnetosphere’s  shape  is  the  result  of  the  interaction  between  the  Earth’s 
magnetic  field  and  the  solar  wind  (Figure  2).  As  the  solar  wind  streams  past,  it  pushes  (via 
magnetic  pressure)  and  couples  with  the  Earth’s  magnetic  field  lines  (via  magnetic  field 
line  merging)  allowing  the  solar  wind’s  energy  to  be  transferred  to  the  magnetosphere. 

This  is  why  the  shape  of  the  Earth’s  magnetosphere  is  asymmetric,  a  small  bubble  on  the 
sunward  side,  upstream,  and  an  extended  tail  in  the  anti-sunward  direction,  downstream. 

For  the  purposes  of  this  thesis,  the  relevant  features  of  the  magnetosphere  are:  the 
bow  shock,  magnetopause,  plasmasphere,  magnetotail,  and  the  cusps.  Distance 
measurements  in  the  magnetosphere  are  usually  in  units  of  Earth  radii  (Re).  The  bow 
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magnetosheath 
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Figure  1  The  Earth’ s  Magnetosphere. 
(Parks  [1991:8]) 


shock  is  understood  to  be  similar  in  nature  to  an  aerod5mamic  shock  ahead  of  a  blunt 
body.  The  solar  wind  cannot  pass  through  the  magnetosphere  so  it  must  slow  down, 
transfer  energy  to  the  magnetosphere,  and  then  be  pushed  around  it  by  the  subsequent 
solar  wind.  The  magnetopause  is  the  boundary  between  the  solar  wind  and  the 
magnetosphere  and  is  generally  located  at  10  Re  upstream  of  Earth.  The  plasmasphere  is 
the  lower  portion  of  the  magnetosphere  surrounding  Earth  (3  to  4  Re)  which  corotates 
with  the  neutral  atmosphere  and  ionosphere.  The  cusps  (or  clefts)  are  narrow  regions 
encompassing  magnetic  field  lines  open  to  the  solar  wind  plasma  and  extending  down  from 
the  high  latitude  magnetopause  to  the  polar  ionosphere.  They  indicate  the  primary  areas 
for  solar  wind  plasma  injection  into  the  ionosphere. 

There  are  also  current  systems  in  the  magnetosphere.  These  currents  are 
generated  as  the  ions  and  electrons  traveling  along  the  magnetopause  boundary  or  within 
the  magnetosphere  slowly  drift  across  geomagnetic  field  lines.  The  variations  in  the 
magnetospheric  current  systems  (which  the  geomagnetic  indices  reflect)  affect  the  current 
systems  in  the  ionosphere.  Figures  3  and  4  show  the  major  magnetospheric  current 
systems  and  their  effects  on  the  current  systems  of  the  Earth’s  ionosphere. 

The  magnetospheric  current  systems  are:  the  magnetopause  current,  ring  current, 
neutral  (or  cross-tail)  current  sheet,  and  the  field-aligned  current.  The  current  found  at  the 
magnetopause  is  created  by  the  amount  of  mass  and  energy  transmitted  across  the 
magnetopause.  According  to  Tascione  [1994],  observations  of  this  boundaiy  indicate  that 
approximately  0.5%  of  the  solar  wind’s  mass  incident  on  the  dayside  of  the  magnetosphere 
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passes  through  the  magnetopause.  The  ring  current  encircles  the  geomagnetic  equator, 
and  is  typically  located  between  3  to  6  Re  .  It  involves  a  slow  drift  of  particles  across  the 
magnetic  field  lines.  The  neutral  sheet  current  system  separates  the  oppositely  directed 
magnetic  fields  emanating  from  the  north  and  south  polar  caps.  The  undisturbed  neutral 
sheet  is  about  5  Re  thick  and  has  an  identifiable  inner  edge  near  7  Re  on  the  anti-sunward 
side  The  field-aligned  currents  (also  called  Birkeland  currents)  flow  parallel  to  the  Earth’s 
magnetic  field  into  the  high-latitude  auroral  oval.  These  currents  produce  a  magnetic 
perturbation  in  a  direction  perpendicular  to  the  Earth’s  magnetic  field  and  generate  a  small 
current  by 

HoJll  =(Vx5B)||  (1) 

where  p.o  is  the  magnetic  permeability  of  free  space,  J|  |  is  the  Birkeland  current  density, 
and  6B  is  the  magnetic  perturbation  perpendicular  to  the  Earth’s  magnetic  field  [Kelley, 
1989:296-7].  The  magnetic  perturbation  which  creates  the  small  field  aligned  currents 
also  produces  a  convective  electric  field,  according  to  Maxwell’s  equations.  The  electric 
field  can  be  mapped  into  the  ionosphere  or  back  out  into  the  magnetosphere,  i.e.  the  field 
aligned  currents  may  be  driven  by  the  actions  of  the  ionosphere  or  the  magnetosphere. 

This  interactive  relationship  is  why  Birkeland  currents  are  an  essential  link  between  the 
solar  wind-magnetosphere  system  and  the  ionospheric  system  [Kelley,  1989:296-7], 

These  two  current  systems,  the  magneto  spheric  and  ionospheric,  have  a  great  deal 
of  influence  on  the  time  scale  and  intensity  of  geomagnetic  activity.  The  magnetospheric 
system  responds  much  quicker  to  the  solar  wind  inputs  than  the  ionospheric  current 
system.  The  enhanced  response  time  is  due  to  the  immediate  coupling  of  the  southward 
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IMF  and  the  Earth’s  magnetic  field,  resulting  in  the  erosion  of  the  Earth’s  field  when  the 
IMF  turns  southward.  The  majority  of  the  energy  from  the  solar  wind  is  “loaded”  into  the 
magnetosphere  in  this  time.  This  mechanism  has  been  termed  the  driven  model  for  solar 
wind-magnetosphere  coupling  because  these  parameters  directly  drive  the  reaction  of  the 
magnetosphere.  A  second  mechanism,  associated  with  the  unloading  model,  requires  the 
energy  input  to  be  stored  in  the  magnetotail,  and  later  released  to  modify  the  current 
systems.  The  response  time  of  the  unloading  model  is  on  the  order  of  8  hours.  When  the 
magnetotail  releases  the  stored  energy,  it  is  in  the  form  of  precipitated  particle  injection 
into  the  ring  current  or  auroral  electrojet.  This  unloading  response  is  consistently  apparent 
for  all  of  the  input  parameters  used  in  this  study,  and  will  be  discussed  further  using  the 
parameter  VBs,  which  describes  the  convective  electric  field  and  is  closely  related  to  the 
current  system. 

The  orientation  of  the  IMF  field  is  considered  to  be  the  most  important  factor 
when  determining  the  occurrences  of  geomagnetic  storms.  Southward-oriented  IMF  are 
thought  to  couple  with  the  Earth’s  magnetic  field  lines  and  allow  them  to  be  swept  back 
into  the  magnetotail  (Figure  5).  When  this  occurs,  the  magnetosphere  erodes  as  the  solar 
wind  moves  into  the  regions  where  the  magnetic  field  lines  and  magneto  spheric  particles 
have  been  evacuated. 

“The  energy  transfer  or  coupling  process  begins  when  enhanced  magnetic 
merging  is  initiated  on  the  dayside  magnetopause.  The  process  ends  when  the 
energy  is  irreversibly  dissipated  by  auroral  particle  precipitation.  Joule  heating  in 
the  ionosphere,  or  particle  injection  into  the  ring  current  and  when  energy  is  lost 
fi'om  the  magnetosphere  via  plasmoid  formation”  [Bargatze  et  al.,  1985:6387]. 
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Baumjohann  [1986]  states  “...magnetic  merging  is  the  dominant  coupling  process  and 
provides  roughly  90%  of  the  energy  input  (~  10*^  W)  into  the  magnetosphere.”  The  ring 
current  is  directly  affected  by  the  loss  of  these  particles.  The  ring  current  opposes 
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Figure  5.  Erosion  of  magnetic  field  line  in  the  presence  of  a  southward  IMF. 
(Adopted  from  Parks  [1991:300]  from  the  original  proposal  of  J.  Dungey  [1961]). 

the  Earth’s  magnetic  field  in  the  area  between  the  location  of  the  ring  current  and  the 
Earth  and  reinforces  the  Earth’s  magnetic  field  exterior  to  the  location  of  the  ring  current. 
In  an  attempt  to  reestablish  the  balance  between  the  ring  current  and  the  Earth’s  field, 
particles  are  pulled  in  from  the  magnetotail  as  the  Earth’s  magnetic  field  lines  are  eroded. 
A  disturbance  created  in  this  fashion  is  quantified  primarily  by  the  geomagnetic  index  Dsx 
[Baumjohann  and  Kamide,  1984]. 
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Concurrent  with  ring  current  enhancement,  another  response  to  increased  solar 
activity  begins.  As  the  ionized  particles  move  through  the  magnetic  fields  in  the 
magnetotail,  small  electric  fields  (~1  mV-m'^)  are  generated  [Kelley,  1989:264],  As 
mentioned  earlier,  plasma  is  a  good  electrical  conductor,  and  so,  the  magnetic  field  lines 
act  as  equipotential  surfaces.  As  the  distance,  ds,  between  these  surfaces  decreases,  the 
electric  field  increases  by 

B  =  -dVlds  (2) 

where  Els  the  electric  field  a:hd  dV  is  the  differential  potential.  However,  as  the  electrrc 
field  approaches  the  ionosphere,  the  density  of  ions  and  neutral  particles  increases,  and  the 
conductivity  decreases.  The  ionized  particles  contribute  to  the  current  density  in  the 
auroral  region  for  a  short  period  of  time,  and  increase  the  auroral  electrojet  [Kelley, 

1991 :468].  This  disturbance  is  predominately  quantified  by  the  geomagnetic  indices  AU 
and  AL,  which  represent  the  individual  strengths  of  the  eastward  and  westward  electrojets, 
respectively. 

Geomagnetic  Indices 

Geomagnetic  indices  have  a  dual  purpose;  first,  to  provide  information  on  the 
geomagnetic  activity  level  when  analyzing  phenomena  linked  to  it,  and  second,  to  study 
the  geomagnetic  activity  itself  and  its  response  to  various  parameters,  fi'om  which  one  can 
derive  information  about  the  magnetospheric  machinery. 


16 


“The  oldest  characterization  of  geomagnetic  activity  came  from  the  daily 
estimation  of  geomagnetic  disturbances,  but  nowadays  three  families  of  indices  are 
in  use; 

(1)  Dst  index,  calculated  near  the  magnetic  equator,  which  describes  the 
ring  behavior  [Sugiura,  1964] 

(2)  AU,  AL,  and  AE  indices,  calculated  at  auroral  latitudes,  which  gives 
information  on  the  maxima  of  the  auroral  electrojet  intensity  [Davis  et  al.,  1966] 

(3)  K  indices,  which  are  calculated  at  all  latitudes  but  are  shown  to  be 
appropriate  mainly  at  subauroral  latitudes;  planetary  indices  are  derived  from 
subauroral  K  indices”  [Menvielle  and  Berthelier,  1991]. 

From  the  description  above,  there  are  visible  differences  between  Dst  and  the 

family  of  “Ax”  indices.  The  distinct  difference  is  that  the  indices  measure  changes  in 

different  geographical  locations  around  the  world.  The  effects  which  register  as  changes 

in  Dst  occur  near  the  geomagnetic  equator,  while  AE,  AL,  and  AU  changes  are  seen  when 

phenomena  occur  in  the  auroral  region.  The  ring  current  is  built  from  the  motion  of  the 

trapped  energetic  particles  through  the  Earth’s  magnetic  field.  The  auroral  electrojets 

develop  due  to  atmospheric  instabilities  and  particle  precipitation  from  the  magnetosphere. 

They  are  also  influenced  by  the  currents  of  the  magnetotail. 

The  Dst  index  has  been  discussed  because  of  its  strong  relationship  with  the  solar 

wind-magnetosphere  coupling  and  its  contribution  to  the  K  index.  It  continues  to  be  used 

in  modeling  and  predicting  geomagnetic  activity  levels,  but  is  of  primary  importance  when 

related  to  the  effects  produced  by  the  ring  current. 

The  auroral  electrojet  indices  AE,  AU,  and  AL  were  introduced  by  Davis  and 

Sugiura  [1966]  as  a  measure  of  global  electrojet  activity  and  first  used  by  Amoldy  et  al. 

(1970)  in  a  solar  wind-magnetosphere  coupling  study  [Baumjohann,  1986].  The  AL  and 

AU  indices  are  defined  as  the  negative  and  positive  maxima,  respectively,  for  the  variation 
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of  the  horizontal  component  of  the  Earth’s  magnetic  field  (H)  in  the  auroral  region 
[Kelley,  1989:468],  The  perturbations  are  caused  by  the  Joule  heating  rate  of  the 
westward  and  eastward  electrojets,  respectively  [Perreault  and  Akasofii,  1978],  The  AE 
index  is  defined  as  the  separation  between  the  upper  and  lower  envelopes,  AE  =  AU  -  AL. 
Since  the  electrojet  currents  travel  in  opposite  directions,  AE  then  reflects  the  value  of  the 
total  maximum  electrojet  current. 

There  are  advantages  and  disadvantages  to  using  the  auroral  indices  (AL,  AU,  and 
AE)  in  solar  wind-magnetosphere  coupling  studies.  Since  the  AE,  AL,  and  AU  indices 
depend  on  measurements  made  in  high-latitude  regions,  disturbances  taking  place  in  the 
low-altitude  subauroral  and  equatorial  regions  do  not  manifest  themselves  in  these  indices. 
Another  disadvantage  is  that  the  electrojet  may  move  in  relation  to  the  station  taking 
measurements,  resulting  in  a  variation  being  recorded  in  the  absence  of  a  disturbance. 
These  anomalous  measurements  also  take  place  due  to  limited  number  of  stations 
observing  the  disturbances.  AE  has  an  advantage  over  AL  and  AU  separately  since  it  is 
only  influenced  by  zonally  uniform  non-electrojet  fields,  so  movements  of  the  electrojets 
generally  cancel  out.  Since  it  monitors  the  total  electrojet  current,  the  AE  index  is  more 
often  used  than  AL  or  AU.  An  advantage  in  using  AL  and  AU  over  AE  is  that  one  may 
separately  distinguish  convection  electrojets.  The  electrojets  characterize  direct 
dissipation  of  solar  wind  energy  and  are  related  to  the  sudden  dissipation  of  energy 
previously  stored  in  the  magnetotail  [Kamide  et  al.,  1985].  AE,  AL  and  AU  are 
advantageous  indices  to  researchers  since  they  are  measured  at  1  or  2.5  minute  time 
resolution  which  is  much  less  than  the  time  scale  of  most  substorms  and  all  major  storms 
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(tens  of  minutes  to  hours  for  substorms  and  several  hours  to  days  for  major  storms).  The 
base  value  for  these  indices  is  the  average  activity  of  the  5  internationally  quiet  days, 
which  is  called  Sq  (averaged  solar  quiet  variations  measured)  [Berthelier,  1992], 

Both  the  Dst  and  auroral  indices  contribute  to  the  creation  of  the  K  and  a  indices. 
The  K  and  a  indices  are  different  representations  of  the  same  measurement;  a  is  linearly 
scaled  while  K  is  on  a  logarithmic  scale  (Table  2-1).  The  indices  K  and  a  may  vary  with 
latitude,  longitude,  season,  or  universal  time,  so  a  standard  for  the  world  has  been 
developed.  The  Gottingen  indices  are  the  recognized  world  Kp  and  Ap  values  (planetary 
indices),  and  are  published  4  to  5  weeks  after  their  observation.  There  are  12  worldwide, 
preselected  observing  stations  (Table  2-2)  which  record  data  and  report  them  to 
Gottingen.  They  are  then  corrected  to  filter  out  any  dependence  of  the  indices  other  than 
variations  due  to  the  changing  solar  wind. 


Table  2-1.  Relationship  between  Kp  and  ap.  Note:  ap  is  in  units  of  2  nT,  so  a 
measurement  of  32  is  actually  64  nT.  (Menvielle  and  Berthelier,  1991). 


Kd  interval 

Oo  to  0+ 

ap.  nT 

0 

Kp  interval 

2+  to  3- 

ap,  nT 

9 

Kp  interval 

5-  to  5o 

ap,  nT 

39 

Kp  interval 

7o  to  7+ 

ap.  nT 

132 

0+  to  1- 

2 

3-  to  3o 

12 

5o  to  5+ 

48 

7+  to  8- 

154 

1- to  1o 

3 

3o  to  3+ 

15 

5+  to  6- 

56 

8-  to  8o 

179 

1o  to  1  + 

4 

3+  to  4- 

18 

6-  to  6o 

67 

8o  to  8+ 

207 

1+to  2- 

5 

4-  to  4o 

22 

6o  to  6+ 

80 

8+  to  9- 

236 

2-  to  2o 

6 

4o  to  4+ 

27 

6+  to  7- 

94 

9-  to  9o 

300 

2o  to  2+ 

7 

4+  to  5- 

32 

7-  to  7o 

111 

9o  to  9+ 

400 
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Table  2-2.  Magnetic  observatories  selected  for  determining  Gottingen  Ap. 
(Gehred  et  al.,  1995) 


Symbol 

Observatory 

Geoaraohic 

Geomagnetic 

Le 

Lewick,  Shetland  Islands 

Lat 

60  08N 

Long 

358  49E 

Lat 

+  62.5 

Long 

88.6 

Lo 

Lovo,  Sweden 

59  21 N 

17  50E 

+  58.1 

105.8 

Si 

Sitka,  Alaska 

57  04N 

224  40E 

+  60.0 

275.4 

Rs 

Rude  Skov,  Denmark 

55  51 N 

12  27E 

+  55.8 

98.5 

Es 

Eskdalemuir,  Scotland 

5519N 

356  48E 

+  58.5 

82.9 

Me 

Meanook,  Alberta,  Canada 

54  37N 

246  40E 

+  61.8 

301.0 

Wn 

Wingst,  West  Germany 

53  45N 

9  04E 

+  54.5 

94.0 

Wi 

Witteveen,  Netherlands 

52  49N 

6  40E 

+  54.2 

91.0 

Ha 

Hartland,  Devon,  England 

51  OON 

355  31 E 

+  54.6 

79.0 

Ag 

Agincourt,  Ontario,  Canada 

43  47N 

280  44E 

+  55.0 

347.0 

Fr 

Fredericksburg,  Virginia 

38  12N 

282  38E 

+  49.6 

349.9 

Am 

Amberley,  New  Zealand 

43  09S 

172  43E 

-47.7 

252.5 

The  Kp  and  ap  indices  are  measurements  of  the  variation  in  the  horizontal 
component  (AH)  of  the  geomagnetic  field,  filtered  and  averaged  over  on  3  hour  intervals, 
which  is  representative  of  the  time  scales  of  substorms  in  the  subauroral  latitudes  (1  to  2 
hours  in  duration)  [Berthelier,  1992],  The  base  values  are  those  calculated  by  a  computer 
algorithm  for  the  solar  regular  variation  Sr,  which  are  the  magnetic  variations  of  a 
geomagnetically  quiet  day.  According  to  Menvielle  et  al.  [1991],  this  gives  the  simplest 
smooth  curve  of  “K  indices-type”  variations  corresponding  to  Sr  variations. 


WIND/SWIM  Satellite 


The  SWIM  User’s  Guide  and  Reference  [Phillips  Laboratory,  1995]  provides  a 
great  deal  of  information  not  only  about  the  satellite,  but  its  proposed  functions  and 
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capabilities  as  well.  The  SWIM  (GL-804)  portion  of  the  satellite  is  the  primary  concern  in 
this  research,  as  these  components  measure  and  return  the  data  required.  Since  its  launch 
in  1994,  SWIM  has  been  a  proof-of-concept  experiment  designed  to  demonstrate  space 
weather  forecasting  using  real-time  upstream  solar  wind  data.  The  satellite  remains  in  a 
stable  trajectory  at  LI,  the  point  in  space  on  the  Earth-Sun  line  where  the  gravitational 
forces  balance  (~  240  Re).  The  experiments  on-board  will  run  at  least  until  NASA’s 
Advanced  Composition  Explorer  (ACE)  is  launched  in  late  1997. 

Two  experiments  are  of  specific  interest  which  will  measure  constituents  and 
properties  of  the  solar  wind  and  provide  the  necessary  data:  the  Solar  Wind  Experiment 
(SWE)  and  the  Magnetic  Field  Instrument  (MFI).  They  are  explained  in  the  SWIM  User’s 
Guide  and  Reference  as:  SWE  provides  the  standard  plasma  parameters  of  the  solar 
wind:  velocity,  density,  and  temperature.  MFI  provides  the  three  components  of  the 
magnetic  field  of  the  solar  wind  (Bx,  By,  and  Bz)  [Phillips  Laboratory,  1995:5],  Along 
with  the  parameters  mentioned  above,  one  simple  combination  will  also  be  recorded  and 
returned,  VBs,  the  product  of  the  solar  wind  bulk  speed  and  the  rectified  vertical 
component  of  the  IMF.  This  index  is  a  hybrid  parameter  which  combines  the  dynamic  and 
magnetic  aspects  of  the  solar  wind. 

Models 

As  noted  by  Clauer  [1986],  linear  prediction  filtering  (LPF)  was  originally 
developed  by  mathematician  Norbert  Weiner  [1942]  for  applications  dealing  with 
continuous  time  series.  Levinson  [1942]  adapted  this  technique  to  discrete  time  series. 
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and  his  work  was  included  in  Wiener’s  book  as  an  appendix.  Magnetospheric  researchers 
have  taken  advantage  of  the  LPF  technique,  and  have  begun  to  use  it,  first  Arnoldy 
[1971],  as  a  means  of  characterizing  and  predicting  geomagnetic  activity  using  solar  wind 
data. 

The  use  of  LPF  is  restricted  by  two  basic  assumptions.  The  first  assumption  is  that 
the  data  is  stationary.  Stationarity  indicates  that  the  statistical  properties  of  the  input  and 
output  do  not  change  with  time  [Makridakis  and  Wheelwright,  1978:263].  A  criterion  to 
establish  that  the  data  is  stationary  has  been  developed.  Examination  of  the 
autocorrelations  of  the  data  will  indicate  95%  of  the  coefficients  lie  within  a  range  of  plus 
or  minus  1.96  standard  deviations  fi'om  the  mean  of  the  autocorrelations.  The  second 
assumption  is  the  input  and  output  are  related  by  a  linear  relationship.  If  they  are  not 
related  by  a  linear  relationship,  LPF  results  are  erroneous  since  the  explicit  purpose  of 
using  LPF  is  to  return  a  linear  filter.  Therefore,  linearity  and  stationarity  have  been 
considered  pertinent  assumptions  for  this  data  set. 

Three  authors’  works  are  of  historical  interest  and  important  to  the  undertaking  of 
this  thesis.  Bargatze  et  al.  [1985],  Clauer  [1986],  and  ’Vassiliadis  et  al.  [1995]  have 
written  papers  which  detail  the  use  of  LPF  with  respect  to  the  problem  of  solar  wind- 
magnetosphere  coupling.  Bargatze  et  al.  [1985]  assembled  a  data  set,  and  established  the 
magnetospheric  impulse  response  using  moving-average  linear  filter  analysis.  Clauer 
[1986]  used  the  Bargatze  data  set  to  show  the  applications  of  LPF  in  magnetospheric 
studies  with  different  geomagnetic  indices  and  input  time  series.  Vassiliadis  et  al.  [1995] 
advanced  this  research  by  introducing  the  use  of  a  state-input  space  method  using 
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nonlinear  equations  of  state  for  the  magnetosphere.  Vassiliadis  furthered  improved  his 
attempts  by  working  with  multiple  input  time  series  in  his  analysis.  Although  he  did  not 
use  LPF,  he  broadened  the  approach  to  modeling  geomagnetic  activity  by  using  multiple 
inputs  to  capture  greater  detail  of  the  interactions  taking  place. 

LPF  uses  a  filter  to  model  the  most  general,  linear  relationship  between  the 
measured  magnetospheric  and  solar  wind  quantities  as  shown  below  [Bargatze, 
1985:6387]: 

*00 

0  (3) 

In  this  continuous  time  model,  Hx  is  the  filter.  Ox  is  the  output  parameter,  It-t  is  the  input 
parameter  at  time  T-x,  T  is  the  time  of  the  observation,  and  x  is  the  time  lag.  The  variable 
names  Ox,  Hx,  and  Ix-x,  scalars  for  single  input-output  models,  will  remain  the  same  for  all 
of  the  models  demonstrated.  That  is,  even  for  multiple  input  models,  the  input  variable 
will  remain  I  x-T,  even  though  it  will  have  several  components.  To  use  this  model  for 
multiple  inputs  or  outputs,  substitution  of  matrices  must  be  made  for  the  scalars. 

Using  Equation  [3],  the  geomagnetic  index  AL,  output,  can  be  predicted  once  the 
filter  H  has  been  determined.  For  given  input(s)  of  the  solar  wind  coupling  functions,  LPF 
constructs  the  filter  Hx  by  minimizing  the  least-square  errors  associated  with  estimated  and 
observed  outputs.  Although  the  integral  in  Equation  [3]  extends  to  infinity,  practical 
applications  use  a  predetermined  number  of  coefficients  which  model  the  effects  of  the 
input  reasonably.  These  coefficients  correspond  to  the  time  lags  which  have  a  significant 
impact  on  the  calculation  of  the  output  Ox- 
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Bargatze  et  al.  [1985]  and  Clauer  [1986]  used  the  magnetic  index  AL  as  the 
output  parameter  (Ot)  and  the  solar  wind  parameter  VBs  (the  product  of  the  solar  wind 
bulk  speed  and  the  rectified  Bz  component  of  the  IMF)  as  the  input  (It-t)  for  the  LPF 
model  below.  Clauer  [1986]  extended  this  research  to  include  other  outputs  to  be  tested 
using  similar  coupling  functions  as  inputs  over  discrete  time  intervals.  Equation  [3]  then 
becomes 

00 

Ot=  ^ 

T  =  0  (4) 

The  only  difference  between  the  continuous  and  discrete  models  is  the  form  of  the 
equations  being  solved  under  least  squares  error  techniques.  For  Equation  [3],  the 
equations  are  integrals  while  for  Equation  [4]  the  equations  contain  summations.  Single 
channel  (single  input-single  output)  and  multichannel  filters  (multiple  input-single  output) 
were  used  to  predict  AL  in  this  thesis  while  Clauer  used  only  single  channel  models. 

Clauer  [1986]  made  use  of  several  parameters  as  input:  VBs,  V^Bs,  and 
VBT^sin'‘(0/2).  The  first  is  the  same  parameter  used  by  Bargatze  [1985],  the  second  was 
introduced  by  Murayama  and  Hakamada  [1975],  and  the  third  was  initially  used  by 
Perreault  and  Akasofu  in  1978.  The  coupling  function  inputs  used  in  solar-wind 

_ magnetosphere  studies  fall  into  three  major  categories:  (1)  simple  expressions,  (2)  electric 

field-related,  and  (3)  power-related.  VBs  is  the  rectified  solar  wind  convective  electric 
field  developed  by  Rostoker  et  al.  [1972].  V^Bs  is  considered  a  simple  coupling 
expression,  i.e.,  just  a  relationship  between  two  components  of  the  solar  wind. 
VBT^sin‘*(0/2),  called  the  epsilon  parameter,  e,  is  a  coupling  function  relating  the  power  in 


24 


the  solar  wind  to  geomagnetic  activity.  These  functions  are  the  actual  inputs,  I,  to  the 
LPF  model. 

Vassiliadis  et  al.  [1995],  employing  a  state-input  space  formalism  rather  than  LPF, 
used  the  same  input  parameters  as  Clauer  [1986].  State-input  space  models  are  a  pattern- 
recognition  technique  which  classifies  activity  “patterns”  in  time  series  data  by  associating 
them  with  points  in  the  space  vector  [I  O],  where  I  and  O  are  the  input  and  output  vector 
spaces,  respectively.  The  space  vector  is  a  multi-dimensional  vector  whose  points 
represent  a  magnetospheric-solar  wind  event.  The  input  is  related  to  a  reference  point  of 
the  output  space  vector  by  its  magnitude  and  orientation.  A  predefined  distance  is  set 
about  the  reference  point,  and  any  “neighbors”,  output  points  fi-om  similar  inputs,  are  used 
to  determine  what  the  output  values  corresponding  to  the  new  input  will  return 
[Vassiliadis  et  al.,  1995]. 

Vassiliadis  et  al.  [1995]  discusses  the  comparisons  between  linear  filters  and  their 
predictive  capabilities  and  those  using  nonlinear  techniques.  The  values  of  the  correlations 
provided  by  Vassiliadis  and  Clauer  will  be  used  in  the  verification  and  comparison  of  the 
linear  filters’  predictions. 

Verification 

The  primary  means  of  verification  will  be  to  use  correlations  between  the  measured 
AL  index  and  the  predicted  AL  returned  by  the  models.  The  prediction-observation 
correlation  relationship,  used  by  Vassiliadis  [1995],  establishes  the  correlation  between  a 
predicted  and  observed  time  series  over  some  interval  of  time.  Least  squares  error 
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minimization  of  the  time  series  is  inherent  in  the  LPF  technique,  and  adds  an  additional 
support  in  achieving  the  optimal  models. 

LPF  is  based  on  minimizing  the  least  squares  error  of  the  time  series,  and 
producing  the  most  general  linear  filters  (functions  which  transform  the  input  into  the 
output)  possible.  In  determining  the  filter  coefficients  by  this  criterion,  one  must  minimize 
the  error  between  the  different  time  series  by  subtracting  the  actual  or  measured  output 
from  the  computed  or  predicted  output  [Clauer,  1986:41], 

The  prediction-observation  correlation  relationship  is  defined  as 


^.J^{AL,,,-\AL\y{AL,,,-\AL) 

1  v-1 


where  Cal  represents  the  correlation  value  between  the  observed  and  predicted  time  series 
over  a  given  time  interval  T,  T  is  the  time  interval  over  which  the  series  is  viewed,  i  is  the 


time  increment,  AL  is  the  value  of  the  observed  index  AL,  AL  is  the  value  of  the 
predicted  index,  and  Oal  and  cr -£  are  the  standard  deviations  of  the  observed  and 


predicted  AL  indices,  respectively. 

The  predicted  values  of  AL  will  be  correlated  with  the  measured  values  of  AL  to 
determine  the  accuracy  at  which  the  models  performed.  Previous  works  [Bargatze,  1985; 
Clauer,  1986;  Vassiliadis,  1995]  have  used  this  method  to  compare  the  results  of  single 
input  LPF  and  state-input  space  methods  with  measured  values.  Vassiliadis  [1995:3503] 
reports  the  average  correlations  between  predicted  and  measured  values  of  AL  to  lie 
between  0.50  and  0.65. 
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III.  Methodology 


In  this  chapter,  the  methodology  and  approach  to  predicting  the  AL  index  will  be 
presented  in  seven  sections.  The  first  section  will  describe  the  coordinate  system  in  which 
the  measurements  of  the  solar  wind  data  were  made.  The  second  section  will  show  the 
identification  and  construction  of  the  coupling  functions  which  will  be  used  as  inputs  to 
the  models.  The  third  section  will  discuss  the  terminology  used  with  the  models,  the  types 
of  models  which  can  be  used,  and  how  each  type  of  model  approaches  prediction.  Section 
four  establishes  the  procedure  for  construction  of  the  models  used  for  predicting  AL  in 
this  research.  The  fifth  section  will  explain  the  approaches  to  prediction,  the  prediction 
requirements,  and  the  comparisons  of  the  model  outputs  with  persistance.  The  sixth  and 
seventh  sections  will  describe  the  use  of  the  impulse  and  frequency  response  functions  as 
tools  for  the  analysis  of  the  model’s  output  and  filter  coefficients. 

Reference  Coordinate  System 

The  Bargatze  data  set  provides  raw  data  from  the  WIND/SWIM  satellite  in  GSM 
coordinates.  The  GSM  coordinate  system  is  a  right-handed  Cartesian  system  with  its 
origin  at  the  center  of  the  Earth  [Figure  6].  The  positive  x  axis  is  directed  toward  the  Sun, 
and  the  z  axis  lies  in  the  plane  containing  both  the  x  axis  and  the  geomagnetic  dipole  axis. 
With  these  conditions  established,  the  y  axis  can  not  lay  in  the  ecliptic  plane  at  all  times. 
Therefore,  the  z  axis  will  oscillate  with  the  geomagnetic  coordinates  (through  23  degrees) 
as  the  geomagnetic  dipole  axis  rotates  over  a  period  of  one  day.  GSM  coordinates  are 
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useful  to  reference  data  from  the  distant  magnetosphere  since  the  whole  magnetosphere  is 
expected  to  shift  position,  to  a  first  approximation,  as  the  dipole  axis  moves  [Handbook  of 
Geophysics,  1985:4-3],  Additional  details  concerning  the  data  set  can  be  found  in  the 
Scope  and  Limitations  section  of  Chapter  I. 


Z  axis  oscillation 


Figure  6.  The  Geocentric  Solar-Magnetospheric  (GSM) 
Coordinate  System. 


Identification  and  Construction  of  Model  Inputs 

Bargatze  has  ordered  the  data  set  into  34  intervals  of  increasing  geomagnetic 
activity,  moving  from  low  activity  (quiet)  through  moderate  to  strong  activity 
(geomagnetic  storm  levels).  Appendix  A  reveals  the  data  format  and  units  for  all  of  the 
data  to  be  used  in  this  study.  The  beginning  and  ending  points  of  each  interval  are 
identified  in  Appendix  B  as  well  as  the  number  of  hours  spanned  by  each  interval,  number 
of  data  entries,  and  the  input  parameters  to  be  used,  averaged  for  each  interval. 
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In  order  to  model  the  geomagnetic  activity,  the  original  data  must  be  transformed 
and  built  into  coupling  functions,  or  inputs.  Coupling  functions  are  parameters  whose 
values  are  in  some  way  related  to  an  observed  or  predicted  phenomenon.  The 
construction  of  these  coupling  functions  requires  some  alteration  and  assembly  of  the 
original  data  parameters.  The  Bz  component  of  the  IMF  is  rectified  and  identified  as  B- 
South,  Bs,  where  ; 


I  0,  for  >  0 

forB^<Q. 


(6) 


This  southward  component  of  the  IMF  is  utilized  to  help  form  2  of  the  3  inputs  which  are 
used  by  LPF  techniques,  namely  Bs  and  VBs. 

A  second  parameter,  highly  correlated  with  the  geomagnetic  activity,  is  the  power 
of  the  solar  wind  incident  on  the  magnetosphere  [Gonzalez,  1990].  This  quantity, 
s  =  VBT^sin'‘(0/2),  is  constructed  from  the  data  set  according  to  the  following  procedures: 


tan 


n  -  tan 


for  B^>0 
for  B^<0 


(7) 


(8) 
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where  Bx,  By,  and  Bz  are  the  components  of  the  IMF,  6  is  the  angle  at  which  the  IMF 
interacts  with  the  magnetosphere,  and  By  is  the  total  IMF  strength. 


Model  Terminology 

This  thesis  investigated  autoregressive  (AR),  moving-average  (MA),  and 
autoregressive  moving  average  (ARMA)  models  used  in  representing  and  predicting  time 
series  data  [Makridakis  and  Wheelwright,  1978:252-309],  An  AR  model  uses  a  linear 
combination  of  past  output  values  to  compute  a  new  value  for  the  output.  The  general 
equation  form  which  represents  all  three  different  model  types  is  given  by: 


a^-0{t)  +  a2-0{t-\)  + ...  -Oit-r)  = 


(9) 


where  O  is  the  output  time  series,  I  represents  the  input  time  series,  and  a  and  b  are  the 
coefficients  which  are  computed  through  least  square  error  minimization  using  LPF. 

When  using  multiple  inputs  to  the  LPF  models,  each  of  the  inputs  will  have  a  separate 
series  of  b  coefficients  and  I  inputs  while  there  will  be  only  one  set  of  a  coefficients  for  any 
model. 

For  AR  models,  the  only  input  variable  is  an  earlier  output  value,  so  the  term 
auto  is  applicable,  while  regressive  refers  to  going  back  along  its  own  timeline.  Using 
least  squares  minimization,  coefficients  for  these  earlier  values  are  determined  and  used  to 
calculate  the  new  output.  MA  models  use  linear  combinations  of  the  past  errors  between 
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actual  and  predicted  output  values  with  computed  coefficients  to  influence  the  importance 
of  each  error.  The  predicted  output  values  mentioned  above  are  actually  the  calculated 
values  of  the  output  at  time  t.  The  AR  and  MA  models  are  considered  to  be  special  cases 
of  a  more  general  model.  An  ARMA  model  uses  linear  combinations  of  both  the  past 
values  and  past  errors  to  achieve  the  new  output  [Makridakis  and  Wheelwright,  1978]. 

An  important  feature  of  each  of  the  types  of  models  is  that  they  have  the  ability  to 
increase  their  order.  The  order  of  a  model  identifies  the  number  of  earlier  values  entering 
the  linear  equation  relating  “output”  to  “input”.  For  example,  an  AR  model  of  order  2 
includes  the  current  value  and  the  previous  2  values  of  the  output  variable.  Likewise,  an 
MA  model  of  order  2  uses  the  current  error  and  the  previous  2  errors  between  the 
predicted  and  actual  output.  An  ARMA  model  which  combines  these  two  models  would 
be  classified  as  ARMA(2,2). 

As  the  models  increase  in  order,  successively  larger  numbers  of  previous  errors  or 
values  will  be  required  for  the  model  to  begin  optimizing  the  filter  coefficients.  For  large 
time  series,  this  poses  a  small  problem  since  a  great  deal  of  prior  data  can  be  used  to 
optimize  the  filter  before  real-time  analysis  is  required.  For  small  time  series,  however, 
this  is  a  significant  problem.  If  there  are  an  insufficient  number  of  data  points  to  allow 
optimization  of  the  filter  coefficients,  initial  estimates  must  be  made  so  that  the  filters 
approach  the  ideal  values  more  quickly.  This  can  be  done  by  solving  a  system  of  nonlinear 
simultaneous  equations  of  n  dimension,  where  n  is  the  order  number.  A  detailed 
description  of  the  procedure  to  develop  the  system  of  equations  can  be  found  in  the 
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mathematical  supplement  appendix  of  chapter  9  of  Interactive  Forecasting:  Univariate 
and  Multivariate  Methods  [Makridakis  and  Wheelwright,  1978], 

The  input  variables  need  not  begin  with  the  current  time.  In  many  circumstances,  a 
delay  is  introduced.  A  delay  indicates  the  value,  measured  in  time  intervals,  of  the 
temporal  offset  or  lag.  For  example,  a  delay  of  4  would  mean  the  input  for  an  AR,  MA,  or 
ARMA  model  would  begin  with  the  4th  previous  value  or  error  as  the  current  value  or 
error.  If  the  order  was  2  for  the  AR  or  MA  models,  the  fourth,  fifth,  and  sixth  previous 
values  or  errors  would  be  used  to  determine  the  next  output.  For  ARMA,  the  same 
previous  values  and  errors  would  be  used  in  the  models  together.  In  all  of  these  cases, 
associated  with  the  values  and  errors  is  a  set  of  coefficients  which  determine  how  they 
influence  the  new  output. 

The  representation  for  a  second  order  AR  or  MA  model  with  a  time  delay  of  4 
used  to  compute  the  next  value  of  an  output  is,  therefore,  given  by: 

a,  •  0(0  =*o  •  -  4)  +  •  I{t  -  5)  +  •  lit  -  6)  (10) 

where  0  is  the  output,  I  are  the  past  values  of  the  outputs  or  errors  of  the  AR  or  MA 
models,  respectively,  used  as  inputs  and  a,  and  b,  are  the  coefficients  to  be  used  in 
construction  of  the  filter  for  this  model.  The  equation  for  an  ARMA  model  would  be  very 
similar,  the  only  difference  being  the  right  hand  side  of  Equation  [10]  would  contain  an 
additional,  similar  set  of  terms.  One  set  of  inputs  and  coefficients  would  pertain  to  the 
past  values  of  input  and  the  other  to  past  errors. 
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The  coefficients  in  Equation  [9]  and  [10]  are  the  filter  coefficients  or  amplitudes. 
The  filter  is  built  by  selecting  a  specific  model,  delay,  and  order  for  the  data  set.  On  a 
point-by-point  basis,  the  filter  coefficients  would  change  continually  throughout  a  data  set. 
By  minimization  of  the  least-squares  errors,  partial  derivatives  of  Equation  [10]  with 
respect  to  the  coefficients  are  computed,  and  the  filter  coefficients  are  optimized  over  a 
given  time  interval.  This  optimization  technique,  the  foundation  for  adaptive  filtering 
processes,  provides  the  best  fit  for  linear  methods  over  the  whole  data  set. 

The  construction  and  analysis  of  the  models  was  performed  using  a  suite  of 
procedures  in  the  Matlab®  System  Identification  Toolbox®.  The  ARX  function  is  given 
the  input-output  (10)  data,  in  the  form  of  coupling  functions,  the  order  of  the  AR  model 
to  use  on  the  input  and  output,  and  the  time  delay  and  computes  the  loss  function  for  each 
model  [Ljung,  1992].  The  loss  functions  for  each  different  model  were  calculated  using 
the  ARXSTRUC  command.  This  command  computes  the  loss  function  of  each  model  by 
a  cross  validation  technique,  also  known  as  Akaike’s  Final  Prediction  Error  (FPE) 
criterion.  The  loss  function  over  a  desired  interval  is  the  difference  between  the  actual 
output  value  and  the  predicted  value  generated  by  the  model’s  least-squares  error 
estimate.  The  loss  functions  were  then  sorted,  by  the  SELSTRUC  command,  to  find  the 
smallest  FPE,  which  is  given  by 


FPE  =  *  V 

\-nlN 


(11) 


where  n  is  the  total  number  of  estimated  parameters,  N  is  the  length  of  the  data  record, 
and  V  is  the  loss  function  for  the  structure  in  question  [Ljung,  1992]. 
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There  are  several  types  of  loss  functions  which  can  be  computed  depending  on  the 
size  of  the  data  set  or  how  the  data  set  is  used  in  the  model.  For  small  data  sets  or  where 
the  interval  to  be  tested  includes  all  of  the  data  points,  a  different  method  of  computing  the 
least  squared  errors  is  suggested.  Using  the  Matlab®  software,  there  are  at  least  two 
alternatives  to  using  Akaike’s  FPE  method.  They  are  Akaike’s  Information  Theoretic 
Criterion  (AIC)  and  Rissanen’s  Minimum  Description  Length  criterion  (MDL)  [Ljung, 
1992: 1-52-3].  They  are  given  by: 


AIC  « log 


N 


(12) 


MDL  = 


l  +  log(A)-^ 


(13) 


respectively.  The  variables  used  in  Equations  [12]  and  [13]  are  the  same  as  those  used  in 
Equation  [11]. 

Model  Construction 

The  construction  of  the  models  to  predict  the  AL  index  consists  of  several  steps. 
The  first  step  involves  the  testing  of  the  coupling  fimctions  or  inputs  to  determine  if  they 
are  going  to  provide  good  correlations.  If  the  coupling  functions  are  not  related  to  the 
output  at  all,  there  is  no  reason  to  build  the  models  containing  them.  Secondly,  the 
models  are  built  using  the  coupling  functions  and  the  measured  AL  values  using  LPF 
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techniques.  Finally,  the  models’  outputs,  the  predicted  AL  for  each  model,  are  compared 
by  correlating  the  measured  and  predicted  AL  values. 


The  measured  AL  and  coupling  function  values  are  correlated  to  determine  the 
prudence  of  using  the  coupling  function  as  an  input  to  a  model.  Matlab®  computes  the 
cross  correlations  between  the  inputs  and  the  measured  values  of  AL  using  the 
CORRCOEF  command.  CORRCOEF  uses  the  covariances  of  the  input  and  output 
matrices  to  compute  the  correlation  coefficients  matrix  by  Equation  [14],  where 

CQJ) 


Corrcoef  = 


7(C(/,/)  *  C(J,j) 


(14) 


and  C  is  the  covariance  matrix  for  the  input(s)  and  output  and  (i,j)  indicate  the  respective 
covariance  elements.  If  the  input(s)  correlated  perfectly  with  the  AL  index,  the 
correlation  values  would  equal  unity,  indicating  input(s)  and  output  amplitudes  and  phases 
are  coincident.  The  proximity  of  the  correlation  values  to  unity  indicates  the  accuracy  of 
fit  between  the  input(s)  and  the  output. 

Once  the  inputs  have  been  tested  to  insure  they  will  provide  good  results,  the 
coupling  functions  can  be  entered  into  the  model.  The  filter  coefficients  were  created 
using  the  ARX  command,  which  performs  AR  techniques  of  order  X  to  return  the  least- 
squares  estimate  of  the  filter  coefficients  for  each  model.  Models  were  constructed  in 
groups  of  several  thousands  at  a  time,  encompassing  orders  from  ARMA(1,1)  to 
AR]V[A( 150,200)  with  time  lags  extending  from  0  to  250  time  steps,  or  real-time  lags  of  0 
to  10.5  hours. 

The  filter  coefficients  which  were  generated  by  the  models  were  saved  and  used  to 
predict  values  of  the  AL  index.  The  input(s)  and  filters  were  used  by  the  IDSIM 
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command  to  predict  the  output  values  of  AL.  IDSIM  uses  a  built-in  FILTER  function  to 
implement  a  standard  difference  equation  and  solve  for  the  output  values  [Matlab,  1992], 
The  difference  equation  uses  all  of  the  filter  coeflBcients  and  time  delays  recorded  in  a 
model  to  produce  the  simulated  output  using  the  input  matrices  (one  or  more)  being 
investigated.  The  loss  functions,  discussed  above,  were  then  computed  to  find  the  optimal 
set  of  filter  coefficients  for  each  coupling  function. 

Once  the  output  has  been  generated.  Equation  [14]  is  again  used  to  determine  how 
well  the  model  has  performed  by  calculating  the  correlation  matrix  between  the  predicted 
and  measured  values  of  AL.  Typical  values  for  the  correlation  have  averaged  from  0.50  to 
0.65  in  previous  studies,  depending  on  the  coupling  functions  used  as  input  [Vassiliadis, 
1995:3503].  This  correlation  describes  the  ability  of  the  model  to  predict  the  AL  index 
given  the  same  input  values  over  the  same  period  of  time.  The  correlation  values  achieved 
using  single  and  multiple  inputs  for  these  models  will  be  discussed  in  Chapter  IV  and  the 
actual  values  can  be  seen  in  tabular  format  in  Appendix  D. 

Prediction 

The  prediction  of  geomagnetic  activity  has  a  large  number  of  possible  users  in  both 
the  military  and  industry.  Most  users  would  request  an  8  - 12  hour  forecast  of 
geomagnetic  activity  to  enable  them  to  terminate  current  operations  and  protect  assets,  or 
to  allow  time  to  redirect  operations  to  alternative  methods.  Currently,  such  long  range 
predictions  are  based  on  the  use  of  persistance  and  a  forecaster’s  discretion.  Although 
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these  methods  are  currently  accepted  as  the  most  reliable,  a  computer  model  with  a 
guaranteed  level  of  reliability  over  this  period  of  time  would  gain  acceptance  very  quickly. 

The  models  used  in  this  research  are  capable  of  making  forecasts  nearly  this  far 
into  the  future.  This  capability  is  directly  related  to  the  location  of  the  satellite  recording 
the  solar  wind  parameters  and  the  model’s  ability  to  predict  AL  several  time  steps  into  the 
future.  The  data  which  is  received  by  the  WIND/SWIM  satellite  will  precede  the  arrival 
of  the  physical  components  of  the  solar  wind  at  the  magnetopause  by  approximately  45 
minutes.  However,  for  high  solar  wind  speeds,  this  time  may  be  more  than  halved.  In 
addition  to  this  lead  time,  the  solar  wind  requires,  theoretically,  at  least  one  hour  to 
interact  with  the  magnetosphere.  This  provides  input  to  the  models,  once  the  data  are 
incorporated  into  coupling  functions,  nearly  2  hours  prior  to  any  expected  increase  in 
geomagnetic  activity.  By  combining  the  time  delays  inherent  in  the  data  and  short  term 
predictions,  of  approximately  2.5  hours,  by  a  model,  a  total  forecast  time  of  3.5  to  4.5 
hours  can  be  achieved.  Thus,  a  more  detailed  forecast  in  terms  of  both  onset  time  and 
magnitude  of  expected  geomagnetic  activity  is  achievable. 

In  order  for  the  models  to  make  these  predictions  correctly,  two  major 
prerequisites  must  be  fulfilled.  The  first  requirement  is  that  there  must  be  input  data  for 
the  model  to  run  and  there  cannot  be  gaps  in  the  data  set.  If  there  are  gaps  in  the  data  or 
no  data  at  all,  the  model  cannot  run  without  incorporating  interpolation  or  substitution. 
From  Equations  [3]  or  [4],  the  model  needs  both  an  input  and  the  predetermined  filter 
coefficients  in  order  to  predict  the  output.  The  second  major  requirement  is  that  the  input 
data  must  be  measured  on  the  same  time  scale  as  the  data  which  was  used  to  develop  the 
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filter  coefficients.  If  this  requirement  is  not  fulfilled,  the  predicted  AL  values  will  not  be 
comparable  with  the  measured  AL  values. 

The  different  types  of  prediction  which  can  be  performed  with  completed  models 
are  the  single-step  and  the  block  forecast  methods.  The  first  method,  the  single-step 
method,  predicts  each  step  out  to  the  time  desired.  By  using  each  predicted  value  of  AL 
as  an  actual  value  of  AL,  the  single-step  method  propagates  until  it  has  reached  the 
desired  prediction  time.  The  single-step  method  of  prediction  should  not  allow  any 
significant  errors  to  occur  as  each  value  of  AL  is  generated  using  actual  values  of  the  solar 
wind  input  and  a  proven  filter. 

The  second  method,  block  method,  uses  past  data,  propagates  it  forward  in  time, 
and  reuses  it  as  a  new  set  of  input  data.  Even  if  the  solar  wind  parameters  change 
radically  over  a  time  period  of  half  an  hour,  this  method  can  produce  reasonable  results. 
This  eflfect  can  be  explained  by  considering  the  impulse  response  of  the  system.  The 
“peaks”  of  the  impulse  response  show  that  the  system  can  “remember”  inputs  for 
approximately  90  minutes.  Therefore,  small  blocks  of  past  input  data  can  be  replicated 
and  brought  forward  in  time  to  keep  a  model  running  without  severe  effects  on  the  output. 

The  main  detriments  to  this  method  are  that  the  values  of  the  output,  AL,  are  not 
known  ahead  of  time,  and  if  there  are  significant  variations  in  the  solar  wind  data,  the 
model  will  under-  or  overpredict  the  actual  values  of  AL,  dependent  on  the  characteristics 
of  the  data  which  are  made  use  of  in  the  forecast.  If  the  data  used  in  the  forecast  contains 
solar  wind  parameters  which  produce  strong  geomagnetic  activity,  the  model  will  forecast 
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extended  periods  of  strong  geomagnetic  activity  when,  actually,  the  activity  may  decrease 
as  the  solar  wind  components  drop  in  intensity. 

The  lack  of  input  data  is  the  primary  restriction  of  the  models  to  forecasting  out  to 
tens  of  hours  or  days.  Forecasts  made  using  persistance  can  be  used  for  general  activity 
levels  of  geomagnetic  activity,  but  do  not  provide  specific  onset  times  or  specific  levels  of 
activity.  Generalized  forecasts  can  be  made  by  recording  the  positions  of  long  lasting 
disturbances  and  features  on  the  Sun,  and  predicting  their  return.  By  measuring  the 
intensity  of  the  disturbances  on  the  Sun  and  the  geomagnetic  activity  level  they  generated, 
a  relationship  can  be  developed  to  predict  average  activity  levels  on  a  monthly  time  scale. 
This  method  is  not  useful  to  organizations  which  must  continue  or  can  only  shut  down 
operations  for  short  periods  of  time. 

Persistance,  on  the  other  hand,  takes  the  measure  of  the  previous  time  period’s 
activity  level  and  uses  it  to  predict  the  value  of  the  next  time  period’s  activity  level. 
Obviously,  this  method  does  not  consider  any  changes  in  the  solar  wind  parameters  over  a 
given  interval.  As  such,  it  will  be  unable  to  predict  sudden  increases  or  decreases  in  the 
geomagnetic  activity  levels.  Since  there  are  eight  time  periods  in  each  day  for  predictions, 
this  method  may  also  predict  unreasonably  high  or  low  levels  of  activity.  Again,  this 
method  should  not  be  forwarded  to  users  since  it  does  not  consider  changes  which  may 
take  place  in  the  solar  wind  and,  therefore,  in  the  geomagnetic  activity. 
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Impulse  Response 


The  impulse  response  is  the  magnitude  of  the  output  at  time  x  resulting  from  an 
input  of  unity  at  time  x  =0.  The  filter  coefficients  which  were  generated  by  the  LPF 
techniques  are  used  to  transform  a  single  value  of  unity  input  into  the  AL  output  values 
over  the  time  interval  the  filter  covers.  As  described  in  coordination  with  Equation  [4], 
the  filter  is  of  finite  length,  covering  the  time  lags  which  are  most  important  to 
approximate  the  actual  output  with  the  predicted  values. 

The  impulse  response  curves  shown  in  Figures  7a  and  7b  were  built  with  a 
constructed  input  time  series,  the  predetermined  filters,  and  the  IDSIM  command  from 
Matlab®  which  uses  the  filters  to  transform  the  input  into  the  output.  The  time  series  input 
consisted  of  an  hour  of  zero  input,  a  single  input  of  unity,  and  followed  by  a  time  series  of 
zero  inputs.  The  initial  zero  input  values  were  to  ensure  the  filter  was  stable  and 
responded  only  to  non-zero  inputs.  The  single  value  of  unity  allows  the  filter  to  respond 
to  a  unit  input  at  each  individual  time  lag.  The  response  from  the  filter  shows  the  impulse 
response  of  the  output.  Since  the  input  value  is  unity  and  only  zeroes  follow  it  in  the  time 
series,  the  impulse  response  is  also  the  magnitude  of  the  individual  filter  coefficients.  The 
zeroes  trailing  the  single  unit  input  allow  the  filter  to  show  how  the  individual  coefficients 
transform  the  normal  time  series  input  into  the  predicted  index  AL. 

In  order  for  the  exact  relationship  between  an  input  and  an  output  to  be  shown,  the 
impulse  response  curve  must  be  finite  in  length.  A  perfect  curve  would  go  to  zero  and 
remain  there  after  all  of  the  coefficients  contributing  to  the  output  were  used.  The 
“memory”  of  the  impulse  response  would  “recall”  the  events  that  have  taken  place  earlier 
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FigureTa.  Impulse  response  curve  forVBs. 


and  use  these  coefficients  until  the  physical  system  “forgot”  the  event  had  occurred.  In 
other  words,  the  original  event  would  have  moved  into  the  past  far  enough  that  the 
impulse  response  no  longer  viewed  it  as  a  contribution  to  the  current  state  of  geomagnetic 
activity.  The  system  memory  would  then  represent  energy  input,  energy  storage,  and 
eventually,  energy  dissipation. 

Figures  8a  and  8b  show  the  impulse  response  curves  for  the  convective  electric 
field  coupling  function,  VBs,  over  moderate  and  strong  geomagnetic  activity  regions 
(intervals  10  to  14  for  moderate  and  27  to  31  for  strong).  These  intervals  and  activity 
levels  chosen  because  they  represent  a  wide  range  of  possible  solar  wind  inputs  to  the 
models.  They  were  also  used  in  previous  studies  [Clauer,  1986]  and  could  be  compared 
to  them  for  accuracy  and  validation.  The  curves  were  created  through  correlation  and 
covariance  analysis  methods  using  the  Matlab®’s  CRA  command.  The  cross  correlation 
values  are  an  unsealed  measure  of  the  impulse  response.  Properly  scaled,  these  values 
return  the  impulse  response  curves  for  the  different  inputs. 

The  structured  curves  in  each  figure  represent  the  actual  filter  coefficients  for  the 
given  time  intervals  or  geomagnetic  activity  levels.  The  smooth  curves  in  Figures  8a  and 
8b  are  presented  as  general  guides  to  locate  general  peaks  and  minimums.  These  curves 
help  to  indicate  which  time  lags  in  the  filter  are  most  important  to  correctly  predicting  the 
output. 

By  examining  these  curves,  the  importance  of  past  values  of  the  input  becomes 
clear.  The  curves  in  figure  8a  show  two  predominant  peaks  at  approximately  12  and  30 
time  lags,  or  30  and  75  minutes,  respectively,  for  moderate  geomagnetic  activity.  The 
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VBs  filter  for  strong  activity  (5  interval  average) 
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FigureSb.  Impulse  Resp 

in  strong  geo 


rest  of  the  curve  decays  gradually  to  near  zero  values.  The  curves  in  figure  8b  shows  a 
single  general  peak  at  approximately  10  time  lags,  or  25  minutes  after  which  the  curve 
decays  towards  an  amplitude  of  zero.  The  peaks  in  both  figures  relate  the  importance  of 
events  at  these  earlier  to  the  current  values  of  the  output,  AL.  Analysis  of  the  mechanisms 
behind  the  peaks  will  take  place  in  Chapter  IV. 

These  curves  helped  to  develop  the  models  used  to  predict  the  magnetic  index  AL 
and  compare  it  to  the  measured  AL  in  the  Bargatze  data  set.  The  models  were  initially 
built  using  the  software’s  default  filter  length  (20  coefficients).  Once  the  new 
corresponding  impulse  responses  were  viewed,  the  filters  were  determined  to  be  far  too 
short  in  length  to  estimate  the  model  properly.  At  this  point  in  the  analysis,  the  models 
were  rebuilt  several  times  until  their  impulse  response  curves  showed  that  the  magnitude 
of  the  filter  coefficients  decayed  to  and  remained  near  values  of  zero.  The  drop  in  the 
amplitudes  of  the  impulse  response  indicated  that  the  current  filter  length  was  sufficient  to 
model  the  greatest  portions  of  the  variability  in  the  output.  Any  further  increase  in  the 
filter  length  would  not  benefit  the  predictive  quality  of  the  model  significantly,  and  would 
increase  the  computation  time  of  the  model  to  where  the  model  itself  would  no  longer  be 
useful. 

Frequency  Response 

The  frequency  response  is  a  time  domain  description  of  the  predictive  model.  This 
is  important  because  the  predicted  and  measured  outputs  must  correlate  well  at 
frequencies  corresponding  to  geomagnetic  storms  and  substorms  in  order  to  model 
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geomagnetic  activity.  Geomagnetic  storms  and  substorms  occur  on  frequency  scales  of 
llO"^  and  lower  and  to  MO"*  ”‘‘/sec,  respectively.  If  the  model  can  account  for 
changes  in  frequency  phase  and  amplitude  in  the  ranges  of  storms  and  substorms,  it  should 
predict  geomagnetic  activity  well.  The  higher  frequency  scales  correspond  to  individual 
data  measurements,  on  the  order  of  7- 10'^  ”**460  and  above.  If  the  predicted  frequencies 
match  the  phase  and  amplitude  of  the  measured  AL  index,  the  model  should  be  able  to 
predict  the  output  well  from  step  to  step. 

The  AL  frequency  response  derived  from  a  single  input,  VBs,  prediction  is 
compared  with  the  actual  AL  outputs  in  Figure  9.  The  upper  plot  in  Figure  9  is  the 
amplitude  comparison  of  the  actual  and  estimated  data  while  the  lower  plot  is  the  phase 
comparison.  The  frequency  phase  responses  of  the  two  models  correspond  well  in  the 
lower  frequency  regions,  indicating  that  the  predictive  model  should  do  well  in  predicting 
onset  and  duration  of  the  geomagnetic  activity  at  storm  and  substorm  levels.  The 
frequency  amplitude  responses  are  of  the  same  order  of  magnitude  for  the  lower 
frequencies  so  we  should  expect  that  the  predictive  model  will  tend  to  deviate  from  the 
actual  intensity  of  the  geomagnetic  activity  by  a  small  amount. 

As  the  frequency  response  function  moves  into  the  higher  frequency  regions 
(moving  to  the  right),  the  phase  responses  diverge.  This  would  indicate  that  the  model 
will  not  respond  closely  with  the  actual  AL  index  for  each  time  step.  However,  since  the 
general  shape  and  trend  of  the  curve  remain  similar  to  the  actual  phase,  the  model  will  not 
become  erratic  and  unstable.  The  amplitude  response  of  the  model  remains  very  close  to 
the  actual  amplitude  responses  of  AL.  The  amplitudes  of  the  model’s  responses  should, 
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therefore,  remain  relatively  close  to  those  of  the  measured  AL  regardless  of  the  frequency. 
The  errors  in  the  simulated  model  will  be  apparent  primarily  by  small  deviations  in 
amplitude  throughout  the  frequency  range  of  inputs  and  by  the  model’s  inability  to 
respond  exactly  as  the  actual  AL  index  does  for  the  individual  inputs. 

The  present  analysis  of  the  model  frequency  analysis  confirms  in  the  works  of 
Bargatze  [1985].  He  concluded  that  although  the  VBs  and  AL  time  series  contain  a  high- 
frequency  component,  the  high-frequency  components  are  not  correlated.  This  would 
indicate  the  magnetosphere  acts  as  a  low-pass  filter  for  the  solar  wind  variations  [Bargatze 
et  al.,  1985].  Since  the  correlations  for  the  multiple,  and  even  single,  input  models  in  high, 
the  high  frequency  components  of  the  time  series  do  not  add  significantly  to  the 
correlation  of  the  models. 
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IV.  Results  and  Analysis 


In  this  chapter,  the  results  obtained  from  this  research  will  be  examined  and 
compared  with  the  results  reported  by  Bargatze  et  al.  [1985],  Clauer  [1986],  and 
Vassiliadis  et  al.  [1995].  The  information  will  move  in  a  logical  fashion  from  the  basic 
theory  and  parameters  through  the  actual  analysis  to  the  attempts  at  forecasting.  The  next 
four  sections  will  discuss  comparisons  of  the  impulse  response  curves,  filter  and  model 
content,  the  residuals  and  correlations  for  the  different  models,  and  the  forecasting  results. 

Impulse  Response 

Bargatze’ s  paper  describes  the  development  of  a  series  of  filters  based  on  the 
geomagnetic  activity  level.  The  filters  are  actually  just  the  impulse  response  amplitudes  of 
the  output  to  the  input  for  a  given  time  interval.  To  show  gradual  change  between  the 
levels  of  activity,  he  averaged  each  over  a  five  interval  period.  This  stacked  plot  of  filters, 
depicted  in  Appendix  E,  shows  the  impulse  response  curve  for  a  zero  time  lag  and  a  filter 
order  corresponding  to  approximately  4  hours. 

Clauer  [1986]  used  a  great  deal  of  the  information  revealed  in  the  paper  by 
Bargatze.  His  work  expanded  on  the  work  of  Bargatze  by  using  other  inputs  and  outputs 
to  show  the  advantages  of  the  LPF  techniques. 

The  filter  amplitudes  show  patterns  which  vary  with  activity  levels.  At  the  lower 
activity  levels,  intervals  1  to  6,  the  filters  show  two  peaks  of  approximately  equal 
amplitude.  The  lags  for  these  peaks  are  roughly  at  20  and  60  minutes.  They  are 
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somewhat  variable,  taking  values  for  the  20  minute  peak  from  15  to  30  minutes  and  for 
the  60  minute  peak  from  55  to  70  minutes  [Bargatze,  1985:6390].  The  filters  gradually 
decay  to  zero  after  approximately  2  hours.  The  peaks  of  these  impulse  responses  reflect 
both  the  driven  and  unloading  model  equally,  showing  the  response  of  the  output  is  related 
to  two  separate  mechanisms. 

The  filters  from  6  to  18  also  have  two  peaks,  but  the  second  peak  is  significantly 
larger  than  the  first.  The  peaks  still  remain  near  the  20  and  60  minute  established  by  the 
first  six  filters.  This  range  of  filters  encompasses  the  moderate  range  of  activity,  and 
indicates  that  moderate  activity  is  more  closely  related  to  the  unloading  model,  where 
stored  energy  is  released  into  the  auroral  electrojets. 

Filters  19  to  30  are  much  different  from  the  previous  two  groups  of  filters.  They 
have  a  single  broad  peak  and  takes  longer  to  decay.  The  peaks  of  these  filters  reach  their 
maximum  amplitudes  near  the  20  minute  peak  of  the  first  two  groups.  The  peaks  are 
much  broader,  however,  and  may  cover  as  much  as  a  30  minute  interval.  This  indicates 
the  driven  model  dominates  at  strong  geomagnetic  activity  levels.  The  slow  decay  from 
the  peak  amplitude  is  completed  in  approximately  2.5  hours.  The  impulse  response  curve 
then  stays  near  the  zero  line,  but  remains  slightly  negative  until  approximately  3.5  hours  of 
lag  where  it  begins  to  rise  again. 

The  broadening  of  the  20  minute  peak  for  strong  geomagnetic  activity  levels  is  not 
explained  completely.  A  possible  explanation  is  that  although  the  peak  at  20  minutes 
indicates  the  driven  model  has  taken  precedence,  the  broadening  of  the  peak  would 
suggest  an  interaction  time.  The  solar  wind-magnetosphere  coupling  actions  do  not  take 
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place  immediately.  The  broadening  may  be  the  result  of  the  solar  wind’s  push  into  the 
magnetosphere,  and  the  magnetosphere’s  impedance  to  such  action. 


TIME  LAG  (hours) 


Figure  10.  Filters  10  and  27  for  the  Bargatze  data  set. 

(Adopted  from  Bargatze  [1985:6389]). 

Filters  10  and  27,  reproduced  in  Figure  [10],  have  been  highlighted  in  Bargatze’s 
paper  as  representative  filters  for  the  second  and  third  group  of  filters  mentioned  above. 
Filter  10  demonstrates  the  two  peak  curve  with  the  second  peak  being  higher.  Filter  10 
describes,  in  general,  the  levels  of  geomagnetic  activity  where  the  unloading  and  driven 
model  are  both  important  to  calculation  of  AL,  but  the  former  model  is  preeminent.  Filter 
27  illustrates  the  single  peak  curve,  associated  with  the  dominance  of  the  driven  model, 
with  a  maxima  near  the  20  minute  lag  and  decaying  gradually  thereafter. 
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Filter  and  model  components 


The  models  used  were  built  by  implementing  ARMA  techniques.  The  number  of 
previous  inputs  and  outputs  and  time  lags  had  a  great  deal  of  influence  on  the  ability  of  the 
model  properly  estimate  the  actual  output.  The  papers  mentioned  earlier  do  not  reference 
exact  values  of  orders  or  lags  to  incorporate  into  a  model  in  order  to  maximize  its 
performance.  Clauer  [1986:43]  suggests  three  courses  of  action  in  order  to  produce  the 
best  models  possible.  They  are  to  set  a  predetermined  length  for  the  model’s  filter,  preset 
a  minimum  least  squares  error  for  the  model  to  achieve,  or  continue  until  the  least  squares 
error  levels  out  and  shows  no  sign  of  any  appreciable  increase  or  decrease.  The  models 
used  in  this  thesis  have  set  the  orders  and  lags  which  appear  to  give  the  best  performance. 
For  single  and  multiple  input  models,  the  best  model  structure  is  [10  85  1],  where  10  is  the 
number  of  previous  outputs  used,  85  the  number  of  previous  input  data  points,  and  1 
represents  the  time  lag. 

Figures  1 1  and  12  show  several  intervals  where  strong  activity  starts  suddenly. 

The  model  does  not  respond  quickly  to  and  may  miss  some  onsets  of  geomagnetic 
activity.  This  was  also  noted  by  Clauer  [1986],  so  it  is  not  specific  to  this  effort  and  could 
be  considered  as  a  uncertainity  in  the  physical  system  or  prediction  method. 

Multiple  input  models  are  slightly  different  than  the  single  input  models.  The 
impulse  response  and  model  coefficients  for  multiple  input  are  created  with  all  of  the 
inputs  at  once.  Therefore,  their  impulse  responses  and  coefficients  are  very  different  from 
those  created  using  single  input  models.  In  general,  the  impulse  response  curves  for  the 
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multiple  input  models  are  much  smoother  than  the  single  input  models,  and  each  input 
affects  the  impulse  response  curve  differently. 

Vassiliadis’  [1995]  state-input  space  uses  several  inputs  and  outputs  in  developing 
the  model,  and  has  a  different  approach  to  modeling  the  time  series  compared  to  LPF 
techniques.  The  state-input  space  model  does  not  relate  to  filter  estimation  or  coefficients. 
The  conditions  for  best  fitting  a  model  come  fi'om  an  estimate  of  distance  from  a  specific 
point  in  three  dimensional  space.  This  method  determines  what  level  of  fit  is  required  by 
the  model,  and  chooses  a  distance  from  the  reference  point  which  encompasses  enough 
points  to  give  this  correlation.  The  shorter  the  distance  fi'om  the  reference  point  to  the 
other  points  needed  to  achieve  a  certain  correlation  the  better  the  model.  The  distance  can 
shortened  to  any  minimum  length,  so  long  as  there  is  at  least  one  neighbor  within  that 
distance.  The  state-input  space  model  cannot  perform  its  calculations  without  at  least  one 
nearest  neighbor.  For  Vassiliadis’  2.5  hour  prediction,  use  of  the  three  nearest  neighbors 
in  state-space  produced  a  model  with  an  average  correlation  coefficient  of  92%. 

Residuals  and  correlations 

The  comparison  of  residuals  shows  the  difference  between  the  predicted  and  actual 
values  for  AL.  Bargatze  et  al.  [1985]  and  Vassiliadis  et  al.  [1995]  both  used  residuals  as  a 
method  of  establishing  the  quality  of  their  results.  The  results  of  the  predicted  values  of 
AL  and  the  residuals  from  this  thesis  are  seen  in  Figures  1 1  and  12.  The  units  of  the 
residuals  are  the  same  as  the  AL  index  (nT)  since  they  are  only  the  difference  between  the 
actual  and  simulated. 
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Residuals  attained  were  smaller  in  overall  magnitude  than  those  produced  by 
Bargatze  [1986],  Bargatze  showed  only  the  results  of  the  model  using  VBs  as  model 
input  to  simulate  AL.  As  shown  in  his  paper,  the  residuals  for  VBs  over  moderate 
geomagnetic  activity  levels  reached  maximum  amplitudes  near  300  nT.  Using  the  same 
input  and  interval  of  activity,  the  residuals  of  this  study  reach  an  amplitude  slightly  less 
than  those  of  Bargatze,  approximately  250  nT.  This  indicates  that  this  model  was  better 
able  to  fit  the  data  and  has  a  higher  correlation  since,  in  areas  of  low  residual  amplitudes, 
the  two  results  are  nearly  equal. 

The  results  at  stronger  geomagnetic  activity  were  also  superior  to  those  reported 
by  Bargatze,  when  evaluated  in  terms  of  the  magnitude  of  the  residuals.  Again,  over  the 
same  time  interval  and  input,  this  study’s  residuals  did  not  exceed  the  amplitude  of 
400  nT.  In  contrast.  Bargatze’ s  results  showed  a  maximum  residual  of  600  nT  with 
several  peaks  over  the  400  nT  mark.  Both  sets  of  residuals  show  a  good  deal  of  departure 
from  the  actual  AL  with  the  greatest  differences  occurring  at  the  onset  of  strong  activity. 
This  would  indicate  that  the  models  did  not  respond  well  to  sudden  increases  in  the 
activity  level.  The  models  did,  however,  do  reasonably  well  once  the  model  noticed  the 
increase  in  activity. 

The  results  of  other  single  input  models  and  multiple  input  models  for  moderate 
and  strong  geomagnetic  activity  appear  in  Appendix  C.  The  models  with  more  than  one 
input  show  several  input  graphs,  a  predicted  versus  actual  AL  graph,  and  a  graph  of  the 
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VBs  values  for  moderate  activity 


(moderate  activity). 


VBs  values  for  strong  activity 


(strong  activity). 


residuals.  The  combinations  of  coupling  functions  and  levels  of  geomagnetic  activity  can 
be  determined  by  the  titles  of  each  input  graph.  The  figures  are  numbered  and  have  full 
title  descriptions  of  the  input  or  inputs,  simulated  versus  actual  AL,  and  the  residuals. 

The  residuals  are,  in  general,  smaller  for  multiple  input  models.  The  reason  for  this 
is  using  more  than  one  input  allows  for  modeling  multiple  reactions  of  the  magnetosphere 
which  have  linear  contributions  to  AL.  If  two  or  more  physical  processes  are  causing  the 
growth  or  diminishment  of  AL,  models  using  inputs  which  involve  all  of  these  processes 
would  have  a  better  chance  of  modeling  AL  than  a  model  which  only  involved  a  single 
process.  This  can  also  be  seen  in  the  increase  of  the  correlation  for  multiple  input  models. 
There  is  an  approximate  5%  increase  in  the  overall  correlation  for  each  model  that 
incorporates  an  additional  input  in  predicting  AL.  If  a  model  uses  three  coupling  functions 
as  input,  the  correlation  can  be  expected  to  increase  by  approximately  10%. 

Vassiliadis’  work  shows  a  remarkable  increase  in  correlation  when  compared  to 
any  of  the  models  which  used  LPF  techniques.  His  average  correlation  values  were  near 
94%,  while  those  of  Bargatze  averaged  between  50%  and  65%  [Vassiliadis,  1995],  This 
improvement  may  not  be  solely  due  to  Vassiliadis’  use  of  state-input  space  models,  but 
also  lie  in  the  fact  that  he  used  multiple  inputs.  If  this  is  the  case,  linear  modeling  may  be 
sufficient  to  produce  excellent  results,  given  that  the  coupling  functions  have  some  type  of 
linear  relationship  to  AL  or  any  other  geomagnetic  index. 

The  correlation  between  predicted  and  observed  values  of  the  AL  index  for  various 
models  is  presented  in  Figure  13.  The  results  displayed  include  Bargatze’ s  single  input, 
this  study’s  single  and  two  multiple  input,  and  Vassiliadis’ s  state-input  space  correlation 
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values.  In  Appendix  D,  the  correlation  values  reported  are  those  of  Bargatze  as  reported 
by  Vassiliadis,  the  single  input  model  for  VBs,  two  multiple  input  models,  and  the  state- 
input  space  model  used  by  Vassiliadis  et  al  [1995],  It  is  easy  to  see  that  the  state-input 
space  model  “scored”  the  best  of  the  models  tested.  The  explanation  for  the  improved 
performance  is  the  nonlinear  model  was  able  to  interpret  relations  which  the  linear  models 
could  not,  and  multiple,  instead  of  single,  inputs  were  used.  As  seen  by  both  the  linear  and 
nonlinear  models,  an  increased  number  of  inputs  allows  the  models  to  achieve  a  greater 
correlation  and  accuracy  compared  to  single  input  models. 

Vassiliadis  et  al.  [1995]  reported  that  linear  prediction  filtering  averaged  50-60% 
for  cross  correlations  using  single  inputs.  The  single  input,  LPF,  models  designed  in  this 
thesis  achieved  correlations  in  this  range.  However,  the  multiple  input  models  show  a 
significant  improvement  over  the  single  input  correlations.  The  triple  input  LFP  model  did 
quite  well  with  values  averaging  in  the  middle  to  upper  70%  range.  It  is  apparent  that  the 
next  generation  of  models  will  need  to  include  both  the  adaptive  and  nonlinear  features  as 
well  as  multiple  inputs  to  perform  more  successfully  than  those  illustrated  here. 

Forecasting 

The  method  of  forecasting  AL  is  very  similar  to  the  predictive  method  used  in  all 
of  the  models  described  above.  The  models  predict  the  next  value  of  AL  from  the 
coupling  function  inputs  and  the  previous  values  of  AL.  In  order  to  extend  this  forecast 
beyond  a  single  time  step,  different  techniques  are  available. 
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Correlation  (%) 
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Figure  13.  Correlatio 


Two  types  of  forecasting  were  implemented  in  this  thesis.  The  first  was  an 
extended  single  step  prediction  which,  in  effect,  was  the  same  as  the  prediction  of  AL 
produced  by  the  standard  LPF  models.  The  distinguishing  feature  was  that  the  predicted 
output  was  now  treated  as  the  true,  observed  value  with  autoregressive  feedback  being 
now  driven  by  a  prediction.  The  input  data  stream,  solar  wind  data,  was  assumed  to  still 
be  available  at  the  2.5  minute  intervals.  The  second  forecast  type  assumed  that  both  the 
observed  outputs  and  inputs  were  not  available  over  the  forecast  interval.  Operationally, 
when  those  conditions  are  encountered,  persistance  of  the  output  is  the  standard  forecast. 
However,  improving  on  this,  within  the  LPF  approaches,  requires  that  inputs  be  supplied. 
The  remedy  adapted  was  input  persistance. 

The  basis  of  the  input  persistance  method  is  the  distributed  nature  of  the  impulse 
response.  Previously,  we  established  that  significant  contributions  to  the  impulse  response 
extended  over  a  time  interval  in  excess  of  90  minutes,  with  “peak”  contributions  at  lagged 
times  of  25  and  70  minutes.  Forecasts  using  only  lagged  input  values  in  the  30  to  70 
minute  sector  of  the  true  input  data,  for  example,  would  capture  a  significant  portion  of 
the  impulse  response.  Filling  the  remaining  interval,  if  the  fill  is  chosen  appropriately,  will 
improve  the  prediction  and  enable  a  time  interval  forecast,  tf  (30  minutes  in  this  example). 
The  implementation  of  input  persistance  thus  consists  of  repeating  or  duplicating  the  input 
data  (persistance)  over  a  time  interval  for  forecast,  tf,  and  using  this  current  data,  from 
current  time  t  through  t-tf,  as  the  future  data  for  t  to  t  +  tf. 
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These  forecast  methods  produced  results  which  were  better  correlated  than 
persistance  but  less  accurate  than  the  model’s  prediction  at  short  time  increments.  The 
results  showed  residuals  ranging  from  175  to  400  nT  for  low  to  moderate  activity  for  both 
models  and,  for  strong  activity,  ranging  from  500  to  800  nT  for  the  single  step  method  and 
500  to  950  nT  for  the  input  persistance  method.  The  cross  correlation  values  between  the 
forecast  and  actual  values  of  AL  for  both  methods  ranged  from  0.69  at  low  levels  to  0.35 
for  strong  geomagnetic  activity.  On  average,  the  single  step  method  performed  between 
0.03  and  0.06  better  than  the  input  persistance  method. 

Vassiliadis  et  al.  [1995]  predicted  AL  in  single  steps  out  to  the  desired  forecast 
time.  Using  each  new  predicted  value  of  AL  as  a  previous  value,  he  then  predicted  the 
next  value  of  AL  using  the  actual  inputs  and  their  nearest  neighbors  in  state-input  space. 

In  this  approach,  the  two  limitations  on  single  step  predictions  are  1)  the  amount  of  noise 
in  the  data  and  the  determinism  of  the  coupling  and  2)  the  amount  of  available  data  in  the 
region  of  interest  in  the  state-input  space  [Vassiliadis  et  al.,  1995:3498].  As  seen  in  Figure 
13,  the  correlations  of  the  2.5  minute  prediction  were  the  highest,  by  far,  of  all  of  the 
models.  The  values  for  the  4. 17  hour  prediction  model  were  lower  but  in  the  same  range 
as  the  dual  and  triple  input  models  for  linear  prediction  filtering. 
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V.  Conclusions  and  Recommendations 


This  chapter  will  be  concise  as  it  contains  only  two  sections.  The  first  will  assess 
the  models  developed,  discuss  limitations  and  achievements,  and  compare  them  to 
previous  work  in  the  field.  The  second  section  will  introduce  directions  to  follow  in  order 
to  further  this  research  and  suggest  even  better  methods  for  predicting  the  desired  indices. 

Conclusions 

There  are  several  conclusions  which  can  be  drawn  from  the  results  of  this  research. 
Some  of  them  include  the  use  of  LPF  in  prediction  and  forecasting,  the  accuracy  of  LPF  in 
comparison  to  other  linear  and  nonlinear  models,  and  the  limitations  of  the  LPF  in 
forecasting. 

The  techniques  of  linear  prediction  filtering  represent  a  useful  tool  and  can  be  used 
to  develop  good  models  for  predicting  geomagnetic  activity.  The  impulse  response  curves 
for  the  different  coupling  functions  provide  physical  insight  regarding  characteristic  system 
times  and  enable  assessment  of  physical  mechanisms  and  models.  Multiple  input  models 
allow  the  LPF  techniques  to  achieve  improved  correlation  scores  and  lower  residuals  when 
compared  to  single  input  models.  The  effectiveness  of  these  LPF  approaches,  however, 
has  already  been  surpassed  by  the  use  of  state-input  space  models.  Other  techniques 
capable  of  addressing  nonlinear  systems  becoming  available,  such  as  neural  networks, 
should  be  used  to  examine  the  physical  processes  of  solar  wind-magnetosphere  coupling. 
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LPF  methods  developed  in  this  effort  showed  good  comparisons  to  the  results  of 
the  previous  linear  methods  as  well  as  the  state-input  space  models.  The  past  linear 
models  were  able  to  achieve  correlations  of  approximately  60  with  the  actual  values  of  AL 
while  the  linear  models  of  this  thesis  reach  an  average  78  using  three  inputs.  The  state- 
input  space  model,  also  using  three  inputs,  achieved  much  better  results,  nearly  94%  of  the 
actual  AL  values.  The  LPF  methods  results  are  excellent  for  linear  models,  but  it  can  be 
seen  that  nonlinear  methods  should  be  employed  to  secure  the  best  results. 

An  obvious  limitation  of  LPF  techniques  is  they  model  only  the  linear  processes 
which  take  place.  It  is  clear  that  some  of  the  processes  in  the  solar  wind-magnetosphere 
coupling  are  nonlinear.  Therefore,  nonlinear  models  are  the  best  equipped  to  model  and 
forecast  them. 

Recommendations 

There  are  several  directions  to  take  to  continue  this  research.  They  involve  using 
different  types  of  models,  applying  these  methods  to  different  indices,  and  using  other 
coupling  functions  discussed  by  Gonzalez  [1990]  to  determine  their  importance  with 
respect  to  all  of  the  diflFerent  indices.  Such  investigations  will  provide  substantial  progress 
for  the  field  of  predicting  the  geomagnetic  indices  and  a  better  understanding  of  the 
processes  taking  place  during  the  coupling  of  the  solar  wind  and  magnetosphere. 

Application  of  adaptive  filtering  techniques  to  nonlinear  methods  is  a  good  place  to 
start  additional  research.  Nonlinear  methods  should  show  a  greater  correlation  between 
the  solar  wind  input  and  the  resulting  indices.  In  the  last  several  years,  neural  networks 
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have  been  experimented  with  and  refined  to  a  degree  which  suggests  their  use  would  be 
beneficial  in  the  arena  of  forecasting.  They  are  able  to  model  linear  as  well  as  nonlinear 
effects  in  a  continuous  time  domain.  This  makes  them  very  applicable  to  the  prediction 
and  forecasting  of  geomagnetic  indices.  Their  application  to  this  field  should  be  studied, 
applied,  and  exploited. 

The  different  indices  represent  varied  effects  taking  place  within  the  overall 
coupling  between  the  solar  wind-magnetosphere  system.  By  modeling  each  of  the  indices 
with  adaptive  filtering  techniques,  understanding  can  be  gained  for  both  the  physical 
actions  taking  place  and  the  relative  linear-nonlinear  relationship  between  the  indices  and 
the  solar  wind  input.  By  using  different  coupling  functions  in  the  prediction  of  different 
indices,  new  indications  of  the  processes  taking  place  may  be  revealed. 
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APPENDIX  A:  Data  Formats 


Data  Format  for  Bargatze  Data  Set 

The  Bargatze  data  set  is  a  matrix  which  is  42216  x  14  in  dimension.  The  number 
of  rows  (42216)  is  the  number  of  2.5-minute  time  steps  in  the  interval.  The  numbers  of 
columns  (14)  are  the  types  of  data  in  the  set  itself  Below,  listed  by  column  number,  are 
the  data  which  are  included  in  the  Bargatze  set: 


1.  TIME  in  YYMMDDSSSSS  format  where  YY  is  the  two  digit  year  (i.e.  67 
is  1967),  MM  is  the  month,  DD  is  the  day  of  month,  and  SSSSS  is  the 
seconds  of  day  which  ranges  between  00000  and  86250  for  this  2.5 
minute  data  set. 

2.  AE  Index  in  nT 

3.  AL  Index  in  nT 

4.  X  coordinate  of  IMP  8  in  GSM  coordinates  and  in  km 

5.  Y  coordinate  of  IMP  8  in  GSM  coordinates  and  in  km 

6.  Z  coordinate  of  IMP  8  in  GSM  coordinates  and  in  km 

7.  Bx  in  GSM  coordinates  and  in  nT 

8.  By  in  GSM  coordinates  and  in  nT 

9.  Bz  in  GSM  coordinates  and  in  nT 

10  .  B  sigma  squared  in  (nT)**2,  a  measure  of  the  magnetic  field  variance 

11.  Np,  the  proton  number  density  in  #/cm**3 

12.  V,  the  solar  wind  bulk  speed  in  km/s 

13.  Na,  the  alpha  particle  number  density  in  #/cm**3 

14.  Tp,  the  proton  temperature  in  degrees  K 
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Data  Format  for  Solar  Wind  Experiment  (SWE)  data 


The  following  format  statement  can  be  used  to  read  the  SWE  data  into  a  Fortran 
program.  The  data  itself  is  an  XX  x  24  matrix.  The  length  is  dependent  on  how  long  a 
window  remains  open  while  receiving  the  data.  The  data  in  each  column  is  described  in 
the  list  below. 


Fortran  Code 


READ(1,100)  IYRDAY,UTIME,MILLSEC,DATA(I), 1=1,20) 
100  FORMAT(I6,1X,A12,I9,20F12.3) 


Column  listing 


1 

GSMVX 

KM/S 

2 

GSMVY 

KM/S 

3 

GSM  VZ 

KM/S 

4 

GSM  VT 

KM/S 

5 

GSM  VTHETA 

DEG 

6 

GSM  VPHI 

DEG 

7 

GSE  VX 

KM/S 

8 

GSE  VY 

KM/S 

9 

GSE  VZ 

KM/S 

10 

GSE  VTHETA 

DEG 

11 

GSE  VPffl 

DEG 

12 

DISTANCE 

RE 

13 

GSMRX 

RE 

14 

GSMRY 

RE 

15 

GSMRZ 

RE 

16 

GSERX 

RE 

17 

GSERY 

RE 

18 

GSERZ 

RE 

19 

ATEMP 

DEGK 

20 

ADENS 

P/CM3 

WIND  KP  DATA:  SWE  GSM  VX 
WIND  KP  DATA:  SWE  GSM  VY 
WIND  KP  DATA:  SWE  GSM  VZ 
WIND  KP  DATA:  SWE  GSM  VT 
WIND  KP  DATA:  SWE  GSM  VTHETA 
WIND  KP  DATA:  SWE  GSM  VPHI 
WIND  KP  DATA:  SWE  GSE  VX 
WIND  KP  DATA:  SWE  GSE  VY 
WIND  KP  DATA:  SWE  GSE  VZ 
WIND  KP  DATA:  SWE  GSE  VTHETA 
WIND  KP  DATA:  SWE  GSE  VPHI 
WIND  DIST:  S/C  DISTANCE 
WIND  GSM  RX:  S/C  POSITION 
WIND  GSM  RY:  S/C  POSITION 
WIND  GSM  RZ:  S/C  POSITION 
WIND  GSE  RX:  S/C  POSITION 
WIND  GSE  RY:  S/C  POSITION 
WIND  GSE  RZ:  S/C  POSITION 
WIND  KP  DATA:  SWE  PROTON  TEMP 
WIND  KP  DATA:  SWE  PROTON  DENS 
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Data  Format  for  Magnetic  Field  Instrument  (MFH  data 


The  following  format  statement  can  be  used  to  read  the  MFI  data  into  a  Fortran 
program.  The  data  itself  is  an  XX  x  20  matrix.  The  length  is  dependent  on  how  long  a 
window  remains  open  while  receiving  the  data.  The  data  in  each  column  is  described  in 
the  list  below. 


Fortran  Code 


READ(1, 100)  IYRDAY,UTIME,MILLSEC,DATA(I),I=1, 18) 
100  F0RMAT(I6,1X,A12,I9,18F12.3) 


Column  listing 


1 

GSMBX 

NT 

2 

GSM  BY 

NT 

3 

GSMBZ 

NT 

4 

GSMBT 

NT 

5 

GSM  BTHETA 

DEG 

6 

GSM  BPHI 

DEG 

7 

GSE  BX 

NT 

8 

GSE  BY 

NT 

9 

GSE  BZ 

NT 

10 

GSE  BTHETA 

DEG 

11 

GSE  BPm 

DEG 

12 

DISTANCE 

RE 

13 

GSMRX 

RE 

14 

GSMRY 

RE 

15 

GSMRZ 

RE 

16 

GSERX 

RE 

17 

GSERY 

RE 

18 

GSERZ 

RE 

WIND  KP  DATA:  GSM  BX 
WIND  KP  DATA:  GSM  BY 
WIND  KP  DATA:  GSM  BZ 
WIND  KP  DATA:  GSM  BT 
WIND  KP  DATA:  GSM  BTHETA 
WIND  KP  DATA:  GSM  BPHI 
WIND  KP  DATA:  GSE  BX 
WIND  KP  DATA:  GSE  BY 
WIND  KP  DATA:  GSE  BZ 
WIND  KP  DATA:  GSE  BTHETA 
WIND  KP  DATA:  GSE  BPHI 
WIND  DIST:  S/C  DISTANCE 
WIND  GSM  RX:  S/C  POSITION 
WIND  GSM  RY:  S/C  POSITION 
WIND  GSM  RZ:  S/C  POSITION 
WIND  GSE  RX:  S/C  POSITION 
WIND  GSE  RY:  S/C  POSITION 
WIND  GSE  RZ:  S/C  POSITION 
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(From  Dr.  Klimas,  Personal  Correspondence) 


Interval  number 


Number  of  Points 


(in  Interval) 


1 

1-660 

2 

660-1596 

3 

1596-2592 

4 

2592-3264 

5 

3264-4068 

6 

4068-4956 

7 

4956-5940 

8 

5940-7212 

9 

7212-7764 

10 

7764-9420 

11 

9420-10452 

12 

10452-11064 

13 

11064-12324 

14 

12324-13008 

15 

13008-14532 

16 

14532-15720 

17 

15720-17160 

18 

17160-18564 

19 

18564-19596 

20 

19596-20532 

21 

20532-21708 

22 

21708-22836 

23 

22836-24012 

24 

24012-24912 

25 

24912-26124 

26 

26124-27876 

27 

27876-29796 

28 

29796-31812 

29 

31812-34476 

30 

34476-37104 

31 

37104-38148 

32 

38148-39660 

33 

39660-41460 

34 

41460-42216 

0-27.5 

660 

27.5-66.5 

936 

66.5-108.0 

996 

108.0-136.0 

672 

136.0-169.5 

804 

169.5-206.5 

888 

206.5-247.5 

984 

247.5-300.5 

1272 

300.5-323.5 

552 

323.5-392.5 

1656 

392.5-435.5 

1032 

435.5-461.0 

612 

461.0-513.5 

1260 

513.5-542.0 

684 

542.0-605.5 

1524 

605.5-655.0 

1188 

655.0-715.0 

1440 

715.0-773.5 

1404 

773.5-816.5 

1032 

816.5-855.5 

936 

855.5-904.5 

1176 

904.5-951.5 

1128 

951.5-1000.5 

1176 

1000.5-1038.0 

900 

1038.0-1088.5 

1212 

1088.5-1161.5 

1752 

1161.5-1241.5 

1920 

1241.5-1325.5 

2016 

1325.5-1436.5 

2664 

1436.5-1546.0 

2628 

1546.0-1589.5 

1044 

1589.5-1652.5 

1512 

1652.5-1727.5 

1800 

1727.5-1759.0 

756 
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Bargatze  Data  Set  Components 

[Means  of  the  parameters  used  in  computing  the  various  models] 


(All  calculations  were  made  using  double  precision  accuracy.) 


Interval 

AL  Index 

Bs 

VBs 

V^2BS 

1 

-15.227273 

0.20492577 

67.0471714 

21936.3493 

2 

-25.611526 

0.51431242 

183.263214 

65301.565 

3 

-32.019057 

0.19343149 

64.947771 

21807.2714 

4 

-33.46211 

0.15687739 

74.9729211 

35830.1405 

5 

-39.086957 

0.23949521 

97.4044936 

39615.1359 

6 

-30.56243 

0.41618896 

174.740152 

73366.0032 

7 

-37.17868 

0.70004449 

254.090154 

92225.2897 

8 

-47.568735 

0.93924617 

303.079525 

97798.8533 

9 

-35.799277 

0.2955825 

117.414416 

46640.6008 

10 

-41.629451 

0.43832874 

148.41389 

50251.5135 

11 

-45.03485 

0.39354822 

178.741892 

81181.0653 

12 

-50.587276 

0.168572 

87.3595636 

45272.6046 

13 

-69.458366 

0.63473358 

280.656093 

124095.913 

14 

-66.252555 

0.98601463 

353.110321 

126455.425 

15 

-78.161967 

0.99907416 

407.318642 

166062.224 

16 

-89.783011 

0.88939006 

389.951065 

170973.166 

17 

-106.84178 

0.67115191 

319.716843 

152303.612 

18 

-101.59359 

0.61629239 

319.185418 

165310.058 

19 

-96.736689 

1.56116573 

657.623398 

277016.415 

20 

-83.116329 

1.11901064 

490.158559 

214703.421 

21 

-128.42991 

2.46372657 

893.24006 

323849.982 

22 

-102.69619 

1.20490582 

579.212238 

278434.059 

23 

-133.12234 

1.72903147 

709.895004 

291464.282 

24 

-150.37292 

1.42075298 

640.076672 

288366.909 

25 

-152.65293 

1.06904252 

620.804525 

360507.887 

26 

-186.28694 

0.67888426 

484.125901 

345239.835 

27 

-204.09058 

0.86835436 

556.750409 

356963.739 

28 

-218.17353 

0.9823384 

660.402298 

443972.457 

29 

-227.37186 

1.02337076 

633.85389 

392595.499 

30 

-231.77025 

1.73080193 

935.914742 

506087.029 

31 

-247.61818 

2.0296073 

1195.09673 

703710.617 

32 

-289.64706 

1.394705 

831.035612 

495172.948 

33 

-287.82954 

1.44331395 

883.076023 

540300.509 

34 

-413.08851 

3.50311121 

1825.23298 

951004.755 

Epsilon 

573.915282 

1645.57648 

1371.04054 

1455.70048 

1725.42281 

2078.94568 

2126.45236 

2955.85545 

1972.59965 

1536.57794 

3791.00373 

3042.06145 

4335.40463 

5066.58246 

4294.00479 

4587.83306 

5962.70139 

4757.23215 

8385.39901 

4539.01254 

14215.3971 

8913.44441 

9212.53151 

7087.67384 

9957.54564 

6088.12426 

8528.96208 

10818.8573 

13397.5192 

15770.8044 

15817.4818 

8053.17499 

7390.91338 

34588.5883 
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Appendix  C:  Simulation  of  AL  using  various  models 


The  following  figures  are  numbered  and  included  as  references.  Each  has  one  or 
more  titles  and  descriptions  to  allow  the  reader  a  better  understanding  of  what  is  contained 
and  how  they  would  be  applicable. 
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(moderate 


(strong  activity) 


V^2Bs  values  for  moderate  activity 


ate  activity). 
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Epsilon  values  for  moderate  activity 


(moderate  activity). 
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Bs  values  for  moderate  activity 
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Figure  C7.  Multiple  irput  model  forBsandVBs. 

(moderate  activity). 


Bs  values  for  strong  activity 


(strong  activity). 


Bs  values  for  moderate  activity 


80 


Bs  values  for  strong  activity 


81 


and  Epsilon  (strong  activity). 


VBs  values  for  moderate  activity 


derate  activity). 


VBs  values  for  strong  activity 
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VBs  values  for  moderate  activity 


84 


L 


VBs  values  for  strong  activity 


and  Ep  silo 


V^2Bs  values  for  moderate  activity 


andEpsilon  (moderate  activity) 


\/'^2Bs  values  for  strong  activity 


and  Epsilon  (strong  activity). 


Appendix  D:  Correlation  Values 


Correlations  between  input  and  AL  for  models  used 


Bargatze 

Linear  Prediction  Filtering 

Vassiliadis 

Single 

Single 

Dual 

Triple 

State- 

Input 

Interval 

Input 

Input 

Input 

Input 

Space 

1 

68 

68 

82 

87 

84 

2 

79 

80 

87 

89 

91 

3 

43 

45 

49 

62 

86 

4 

0 

31 

51 

67 

79 

5 

42 

45 

61 

76 

88 

6 

59 

59 

66 

69 

93 

7 

64 

65 

71 

79 

96 

8 

65 

65 

74 

83 

92 

9 

44 

44 

75 

87 

91 

10 

79 

79 

82 

81 

93 

11 

65 

65 

72 

91 

91 

12 

63 

63 

72 

78 

88 

13 

61 

65 

71 

75 

94 

14 

87 

87 

91 

91 

97 

15 

73 

74 

80 

83 

93 

16 

76 

76 

82 

84 

95 

17 

68 

68 

73 

84 

94 

18 

61 

62 

70 

74 

84 

19 

64 

59 

55 

80 

96 

20 

74 

74 

77 

79 

95 

21 

84 

84 

87 

89 

96 

22 

57 

58 

70 

74 

96 

23 

60 

60 

72 

78 

95 

24 

79 

79 

84 

87 

96 

25 

48 

64 

70 

74 

92 

26 

54 

56 

66 

69 

90 

27 

81 

81 

82 

83 

95 

28 

73 

73 

78 

80 

94 

29 

60 

60 

66 

73 

94 

30 

68 

68 

75 

76 

93 

31 

84 

84 

86 

66 

93 

32 

73 

74 

76 

86 

94 

33 

72 

73 

73 

75 

94 

34 

72 

72 

82 

86 

92 

The  values  listed  are  percentages  of  one,  where  100  indicates  a  perfect  or  complete 
correlation  with  the  output  data. 


88 


Appendix  E;  Bargatze’s  Stack  Plot  of  Filters 


-10  1  2  3  4 


TIME  LAG  (hours) 

Figure  El.  Stack  plot  of  linear  prediction  filters  for  all  levels  of  geomagnetic  activity 
(Adapted  from  Bargatze  [1985:6389]) 
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